Releasing Flan-T5xxl_TE-only in FP32, FP16, and GGUF Formats!


- Flan-T5xxl is a new-generation text encoder.
- TE-only extracts only the parts needed for image generation.
- GGUF format is an even lighter model.
Introduction
Hello, this is Easygoing.
Today, I’m excited to introduce the release of the FP32, FP16, and GGUF formats of Flan-T5xxl_TE-only.
What is Flan-T5xxl?
T5 is an AI model for text-to-text generation released by Google in 2020.
Model Name | Size (FP32 Format) | Number of Parameters |
---|---|---|
T5-Small | 0.24 GB | 60 million |
T5-Base | 0.88 GB | 220 million |
T5-Large | 3.08 GB | 770 million |
T5-XL | 11.2 GB | 2.8 billion |
T5-XXL | 44 GB | 11 billion |
The T5xxl model is the largest-parameter model in the T5 family and is used as the text encoder for parsing prompts in Flux.1 and SD 3.5.
Model Name | Release | Parameters | Vocabulary Size | Text Comprehension |
---|---|---|---|---|
T5xxl | October 2020 | 11 billion | 32,000 | Long Sentences & Context |
T5xxl v1.1 | June 2021 | 11 billion | 32,000 | Long Sentences & Context |
Flan-T5xxl | October 2022 | 11 billion | 32,000 | Long Sentences & Context |
Flan-T5xxl is a fine-tuned version of T5xxl, further improving its accuracy. Using the Flan-T5xxl model in image generation AI can lead to improved prompt comprehension and enhanced image quality.
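If you want to see what the encoder actually does outside an image-generation UI, Hugging Face Transformers can load just the encoder half of the model. Here is a minimal sketch (it pulls the official google/flan-t5-xxl repo, which is a very large download; this is an illustration, not part of this release):

```python
# Minimal sketch: load only the encoder half of Flan-T5xxl with Hugging Face
# Transformers and embed a prompt, much as an image generator's text encoder does.
# Note: google/flan-t5-xxl is a very large download.
from transformers import T5EncoderModel, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-xxl")
encoder = T5EncoderModel.from_pretrained("google/flan-t5-xxl")

inputs = tokenizer("a watercolor fox in a misty forest", return_tensors="pt")
embeddings = encoder(**inputs).last_hidden_state  # conditioning for the diffusion model
print(embeddings.shape)  # (1, sequence_length, 4096)
```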
Model List
On the Hugging Face Flan-T5xxl release page, the following models are available:

Model | Size | SSIM Similarity | Recommended |
---|---|---|---|
FP32 | 19 GB | 100.0 % | 🔺 |
FP16 | 9.6 GB | 98.0 % | ✅ |
FP8 | 4.8 GB | 95.3 % | 🔺 |
Q8_0 | 6 GB | 97.6 % | ✅ |
Q6_K | 4.9 GB | 97.3 % | 🔺 |
Q5_K_M | 4.3 GB | 94.8 % | |
Q4_K_M | 3.7 GB | 96.4 % | |

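The SSIM Similarity column above compares images generated with each format against the FP32 output. As a reference for how such a score can be computed (the exact methodology behind the table isn't documented here, so treat the file names and grayscale conversion as assumptions), here is a minimal sketch using scikit-image:

```python
# Minimal sketch: score a quantized run against the FP32 reference with SSIM.
# File names and the grayscale conversion are assumptions; the methodology
# behind the table above may differ.
import numpy as np
from PIL import Image
from skimage.metrics import structural_similarity

ref = np.asarray(Image.open("fp32_output.png").convert("L"))   # FP32 reference image
test = np.asarray(Image.open("q8_0_output.png").convert("L"))  # same seed/prompt, Q8_0

score = structural_similarity(ref, test)  # 1.0 means structurally identical
print(f"SSIM: {score * 100:.1f} %")
```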
Flan-T5xxl_TE-only is a lightweight model that extracts only the text-encoder portion used by image generation AI; the T5 decoder, which image generation never calls, is dropped entirely.
The GGUF files are even lighter models, created by quantizing Flan-T5xxl_TE-only_FP32.
The number in a GGUF quantization name such as Q8_0 indicates the approximate bits per weight; a smaller number means a smaller file but lower accuracy. For example, Q8 stores weights in roughly 8 bits, a quarter of FP32's 32 bits.
For image generation with Flan-T5xxl, I recommend using Q6_K or higher.
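Conceptually, the TE-only extraction just keeps the encoder weights and drops the decoder. A minimal sketch of the idea (file names are placeholders, and the key names assume the Hugging Face T5 layout; the actual conversion scripts, as described later, were AI-written and more involved):

```python
# Minimal sketch of the TE-only idea: keep the encoder (and shared token
# embeddings), drop the decoder that image generation never uses.
# File names are placeholders; key names assume the Hugging Face T5 layout.
from safetensors.torch import load_file, save_file

full = load_file("flan-t5-xxl_fp32.safetensors")
te_only = {
    key: tensor for key, tensor in full.items()
    if key.startswith("encoder.") or key == "shared.weight"
}
save_file(te_only, "flan-t5-xxl_TE-only_fp32.safetensors")
print(f"Kept {len(te_only)} of {len(full)} tensors")
```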
How to Use Flan-T5xxl!
Place the downloaded files in one of the following folders:
- Installation folder/models/text_encoder
- Installation folder/models/clip
- Installation folder/Models/CLIP
ComfyUI
When using Flux.1 in ComfyUI, load the text encoder using the DualCLIPLoader node.

As of April 13, 2025, the default DualCLIPLoader node includes a device selection option, letting you choose where the model is loaded:
- cuda → VRAM
- cpu → System RAM
Since Flux.1’s text encoder is large, in most cases setting the device to cpu and keeping the model in system RAM will improve performance.
Unless your system RAM is 16 GB or less, keeping the full-precision model in system RAM is more effective than shrinking it with GGUF, so there is currently little advantage to using the GGUF format in ComfyUI.
When running Flux.1 in ComfyUI, use the FP16 format text encoder.
In ComfyUI, you can also use the FP32 format text encoder by adding the --fp32-text-enc flag at startup.
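For example, assuming a standard installation launched from main.py:

```
python main.py --fp32-text-enc
```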
Stable Diffusion WebUI Forge
In Stable Diffusion WebUI Forge, select the Flan-T5xxl model instead of the usual T5xxl_v1_1.

To use the FP32 format text encoder in Stable Diffusion WebUI Forge, start with the --clip-in-fp32 option.
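With the standard Windows launcher, this can be added to COMMANDLINE_ARGS in webui-user.bat (assuming the default launcher script; on Linux, append the flag to your usual launch command instead):

```
set COMMANDLINE_ARGS=--clip-in-fp32
```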
Use the Improved CLIP-L Too!
For Flux.1 and SD 3.5, in addition to T5xxl, you can upgrade CLIP-L to a higher-accuracy version.
- LongCLIP-SAE-ViT-L-14 Model (ComfyUI only)
- CLIP-SAE-ViT-L-14 Model
Since Flan-T5xxl and CLIP-L serve different functions, upgrading both can further improve image quality.
CLIP-L is lighter than T5xxl, so I highly recommend trying this upgrade as well.
Claude and Grok’s Coding is Amazing!
I don’t write code myself, so all the code used for this conversion was written by AI.
The two AIs I used are:
- Claude 3.7 (Released February 25, 2025): Highly accurate code
- Grok3 (Released February 15, 2025): Can search the web for the latest information and handle multiple file uploads
I tried the same conversion two months ago using ChatGPT, but the code it produced didn’t work, and the attempt failed.

This time, I had Claude 3.7 write the initial code, then uploaded error messages to Grok 3 for iterative fixes, which greatly improved efficiency and allowed me to complete the model.
If you’re using AI to write code, I highly recommend trying this approach!
Conclusion: Try Flan-T5xxl
- Flan-T5xxl is a new-generation text encoder.
- TE-only extracts only the necessary components.
- GGUF format is an even lighter model.
The biggest appeal of local image generation is the freedom to use high-quality models shared globally for free.
Flux.1 and SD 3.5 can generate high-quality illustrations, but upgrading the text encoder can further enhance image quality.

With the release of the TE-only model, the storage size disadvantage of using Flan-T5xxl is gone, so why not give this upgrade a try?
Thank you for reading to the end!
Update History
April 20, 2025
Added details on using FP32 format in Stable Diffusion WebUI Forge.
April 15, 2025
Revised content to reflect ComfyUI updates.
April 2, 2025
Added a link to the ComfyUI-MultiGPU guide.
March 20, 2025
Updated the Flan-T5xxl model list and table.
March 15, 2025
Added the Civitai link.
March 10, 2025
Added a sample workflow for Flux.1_MultiGPU.