Releasing Flan-T5xxl_TE-only in FP32, FP16, and GGUF Formats!


- Flan-T5xxl is a new-generation text encoder.
- TE-only extracts only the parts needed for image generation.
- GGUF format is an even lighter model.
Introduction
Hello, I’m Easygoing.
This time, I’ve released the FP32, FP16, and GGUF formats of Flan-T5xxl_TE-only, and I’d like to introduce them to you.
What Is Flan-T5xxl?
T5 is a text-to-text generation AI model released by Google in 2020.
| Model Name | Size (FP32 Format) | Number of Parameters |
|---|---|---|
| T5-Small | 0.24 GB | 60 million |
| T5-Base | 0.88 GB | 220 million |
| T5-Large | 3.08 GB | 770 million |
| T5-XL | 11.2 GB | 2.8 billion |
| T5-XXL | 44 GB | 11 billion |
The T5xxl model is the version of T5 with the largest number of parameters and is used as the text encoder for image generation AIs like Flux.1 and SD 3.5.
| Model Name | Release | Parameters | Tokens | Comprehensible Text |
|---|---|---|---|---|
| T5xxl | October 2020 | 11 billion | 32,000 | Long sentences & context |
| T5xxl v1.1 | June 2021 | 11 billion | 32,000 | Long sentences & context |
| Flan-T5xxl | October 2022 | 11 billion | 32,000 | Long sentences & context |
Flan-T5xxl is a fine-tuned version of T5xxl with improved accuracy.
Model List
The following models are available on Hugging Face:

| Model File | Size | Accuracy (SSIM Similarity) | Recommended |
|---|---|---|---|
| flan_t5_xxl_fp32.safetensors | 44.1 GB | 100% | |
| flan_t5_xxl_fp16.safetensors | 22.1 GB | 99.9% | |
| flan_t5_xxl_TE-only_FP32.safetensors | 18.7 GB | 100% | 🔺 |
| flan_t5_xxl_TE-only_FP16.safetensors | 9.4 GB | 99.9% | ✅ |
| flan_t5_xxl_TE-only_Q8_0.gguf | 5.5 GB | 99.8% | ✅ |
| flan_t5_xxl_TE-only_Q6_K.gguf | 4.4 GB | 99.7% | 🔺 |
| flan_t5_xxl_TE-only_Q5_K_M.gguf | 3.8 GB | 98.4% | 🔺 |
| flan_t5_xxl_TE-only_Q4_K_M.gguf | 3.2 GB | 95.2% | |
| flan_t5_xxl_TE-only_Q3_K_L.gguf | 2.6 GB | 84.9% | |
Flan-T5xxl_TE-only is a lightweight model containing only the text encoder portion that image generation AI actually uses.
For image generation, the full model and the TE-only model produce identical output, so the TE-only model saves storage space with no loss in quality.
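The extraction itself comes down to filtering the checkpoint's tensor names before saving. A minimal sketch of the idea (the prefixes below assume standard T5 tensor naming; the actual conversion script may differ):

```python
# Sketch: keep only the tensors an image-generation pipeline needs.
# Assumption: T5-style checkpoints name encoder weights "encoder.*" and
# the shared token embedding "shared.*"; decoder tensors can be dropped.
def select_text_encoder_keys(all_keys):
    keep_prefixes = ("encoder.", "shared.")
    return [k for k in all_keys if k.startswith(keep_prefixes)]

keys = [
    "shared.weight",
    "encoder.block.0.layer.0.SelfAttention.q.weight",
    "decoder.block.0.layer.0.SelfAttention.q.weight",
    "lm_head.weight",
]
# Keeps the shared embedding and encoder weights; drops decoder and lm_head.
print(select_text_encoder_keys(keys))
```

With the safetensors library, the same filter would be applied to the loaded state dict before writing the surviving tensors back out to a new file.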
The GGUF files are further-compressed versions of Flan-T5xxl_TE-only_FP32, created through quantization.
The number after the "Q" indicates the bit depth of the quantization; smaller numbers mean smaller files but lower accuracy.
For image generation with Flan-T5xxl, I recommend Q5_K_M or higher.
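The file sizes in the table follow directly from bits per weight. A rough back-of-the-envelope estimate (pure arithmetic; real GGUF files run somewhat larger because quantized blocks carry scale factors and some tensors stay at higher precision):

```python
# Rough size estimate: parameters x bits-per-weight / 8 -> bytes.
# The TE-only encoder has roughly 4.7 billion parameters
# (inferred from 9.4 GB at FP16, i.e. 2 bytes per weight).
def approx_size_gb(n_params, bits_per_weight):
    return n_params * bits_per_weight / 8 / 1e9

n = 4.7e9
print(f"FP32: {approx_size_gb(n, 32):.1f} GB")  # ~18.8 GB, matching the table
print(f"FP16: {approx_size_gb(n, 16):.1f} GB")  # ~9.4 GB
print(f"Q8:   {approx_size_gb(n, 8):.1f} GB")   # ~4.7 GB before overhead
```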
How to Use Flan-T5xxl!
Place the downloaded files in one of the following folders:
- Installation Folder/models/text_encoder
- Installation Folder/models/clip
- Installation Folder/Models/CLIP
Stable Diffusion WebUI Forge
In Stable Diffusion WebUI Forge, select the Flan-T5xxl model instead of the usual T5xxl_v1_1.

Stable Diffusion WebUI Forge doesn’t support FP32 format, so please use the FP16 or GGUF format models.
ComfyUI
When using ComfyUI, since T5xxl models are large, keeping them in system RAM instead of VRAM can reduce model loading times.
ComfyUI-MultiGPU Custom Node

DualCLIPLoaderMultiGPU and DualCLIPLoaderGGUFMultiGPU Nodes

Using the DualCLIPLoaderMultiGPU or DualCLIPLoaderGGUFMultiGPU nodes from the ComfyUI-MultiGPU custom node pack, you can explicitly load the model into system RAM by selecting cpu as the device.
In ComfyUI, you can also use FP32 text encoders by launching with the --fp32-text-enc flag.
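As a concrete example, assuming a standard ComfyUI install launched from its root folder:

```shell
# Launch ComfyUI with text encoders kept in FP32
python main.py --fp32-text-enc
```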
Try the Improved CLIP-L Too!
In addition to T5xxl, the text encoder CLIP-L can also be upgraded to a more accurate version:
- LongCLIP-SAE-ViT-L-14 Model (ComfyUI only)
- CLIP-SAE-ViT-L-14 Model
Since Flan-T5xxl and CLIP-L serve different roles, upgrading both can further improve image quality.
CLIP-L is lighter than T5xxl, so I highly recommend giving it a try.
Claude and Grok’s Coding Is Amazing!
I can’t write code myself, so all the code used for this conversion was written by AI.
The two AIs I used this time are:
- Claude 3.7 (released February 25, 2025): writes accurate code
- Grok 3 (released February 15, 2025): web search for the latest info and the ability to upload many files
Two months ago I tried the same conversion with ChatGPT, but the code didn't work and the attempt failed.

This time, I had Claude 3.7 write the initial code and then uploaded the error messages to Grok 3 for iterative fixes. This workflow improved my efficiency dramatically and let me complete the models.
If you’re using AI to write code, I highly recommend trying this method!
Conclusion: Give Flan-T5xxl a Try
- Flan-T5xxl is a new-generation text encoder.
- TE-only extracts only the parts needed for image generation.
- GGUF format is an even lighter model.
Image generation AIs like Flux.1 and SD 3.5 can produce high-quality illustrations, and upgrading the text encoder can boost image quality even further.

One of the charms of local image generation is the ability to freely use models openly shared worldwide.
Now that the storage size disadvantage of using the Flan-T5xxl model is gone, why not take this opportunity to upgrade?
Thank you for reading to the end!