Is Negative Prompt Necessary? Unleashing AI’s Creativity!


- Keep CFG Scale low.
- Use Negative Prompts sparingly.
- Let AI create freely and observe its original illustrations.
Introduction
Hello, this is Kimama / Easygoing.
Today, let’s dive into the topic of Negative Prompts in image generation AI.
Theme: Nighttime Snapshot
This time, our theme is a nighttime snapshot.

We’ll capture the fleeting expression of someone spotted while walking through the town.
How Does AI Interpret Prompts?
We input prompts while imagining the images we want AI to generate.

- Red: Target Image
- Green: Conditioning (The image generated based on the input prompt)
The process of linking the input prompt to the actual image is handled by CLIP, resulting in an image referred to as Conditioning.
When Prompts Fall Short
CLIP is trained on pairs of images and their text captions, but its accuracy is not particularly high.
In many cases, the effect of the input prompt is less impactful than we expect.
As a result, image generation AI requires some method to enhance the prompt’s effectiveness.
Boosting Prompts with CFG Scale!
The first tool used to enhance prompt effectiveness is the CFG Scale.

- CFG Scale: Classifier-Free Diffusion Guidance Scale
A scale for guiding the diffusion process that needs no separate classifier (no external models).
CFG Scale amplifies the prompt’s effect based on the input value.
The Importance of a Baseline!
When using CFG Scale, the reference point is crucial.
Every image generation AI accumulates various kinds of noise, both during training and during its computations.

Even if you generate an image without inputting a prompt, the resulting image will be affected by this noise, deviating from the true zero point.
Using Unconditioning as the Baseline!
Thus, the image generated without an input prompt is called Unconditioning, and it’s used as the baseline.
When generating an image, an arrow is drawn from Unconditioning to Conditioning, and by applying CFG Scale, the prompt’s effect is amplified.

As shown in the graph, using CFG Scale brings the generated image closer to the target.
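Mechanically, each denoising step combines the two predictions with this arrow-and-scale arithmetic. Here is a minimal numeric sketch in plain Python; the short vectors are made-up placeholders standing in for the model's predictions, not a real diffusion pipeline:

```python
# Sketch of how CFG Scale amplifies the prompt's effect: start at the
# Unconditioning baseline and move toward Conditioning, scaled.

def apply_cfg(uncond, cond, cfg_scale):
    """Generated = Unconditioning + (Conditioning - Unconditioning) * CFG Scale."""
    return [u + (c - u) * cfg_scale for u, c in zip(uncond, cond)]

uncond = [1.0, 2.0]  # placeholder: prediction for the empty prompt
cond   = [3.0, 6.0]  # placeholder: prediction for the actual prompt

print(apply_cfg(uncond, cond, 1.0))  # → [3.0, 6.0]  (exactly Conditioning)
print(apply_cfg(uncond, cond, 7.0))  # → [15.0, 30.0] (pushed far past it)
```

At scale 1 you land exactly on Conditioning; higher scales overshoot it in the same direction, which is the amplification the graph shows.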
Shifting the Baseline with Negative Prompt!
Building on this mechanism, the Negative Prompt was developed.
As mentioned earlier, Unconditioning is typically generated with an empty prompt.


Left: Without Negative Prompt | Right: With Negative Prompt
In contrast, Negative Prompt involves inputting a prompt into Unconditioning, shifting the baseline itself.
In the right graph, using Negative Prompt adjusts the baseline, changing the arrow’s direction and bringing the image closer to the target.
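As a rough numeric sketch of the same idea (again with made-up placeholder vectors, not real model outputs), swapping the baseline changes where the guidance arrow starts and therefore where the result lands:

```python
# A Negative Prompt replaces the empty-prompt baseline with its own
# prediction, so the same positive prompt and CFG Scale get amplified
# along a different direction.

def apply_cfg(baseline, cond, cfg_scale):
    # Start at the baseline and move toward Conditioning, scaled.
    return [b + (c - b) * cfg_scale for b, c in zip(baseline, cond)]

cond         = [3.0, 6.0]  # prediction for the positive prompt
empty_uncond = [1.0, 2.0]  # baseline: empty prompt (no Negative Prompt)
neg_uncond   = [0.0, 4.0]  # baseline: shifted by a Negative Prompt

print(apply_cfg(empty_uncond, cond, 3.0))  # → [7.0, 14.0]
print(apply_cfg(neg_uncond, cond, 3.0))    # → [9.0, 10.0]
```

Identical positive prompt, identical CFG Scale, yet the two generated results differ noticeably, which is why a Negative Prompt affects the whole illustration rather than just removing elements.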
Negative Prompt Has a Big Impact!
Since Negative Prompt shifts the baseline, it has a significant impact on the entire illustration.
Negative Prompts typically include things you don’t want to generate, but their effect goes beyond simply avoiding specific elements.

Modern image generation AI already has many settings to juggle, and shifting the baseline too far with Negative Prompt makes it easy to lose your bearings.
When using Negative Prompt, it’s essential to maintain balance and aim for minimal usage.
When Is Negative Prompt Effective?
So, when is Negative Prompt particularly useful?
Stable Diffusion 1 Had Low Prompt Fidelity
In the era of Stable Diffusion 1, prompt fidelity was limited due to CLIP’s constraints.

The CLIP-L in Stable Diffusion 1 had a limited vocabulary and was trained on restricted data.
To improve prompt fidelity in Stable Diffusion 1, it was necessary to set a high CFG Scale and input many Negative Prompts, pulling Unconditioning in a negative direction to bring the result closer to the target.
Cases Where Custom Models Specify Negative Prompt Use
Even with SDXL, Negative Prompt can still be effective in one specific case: when a custom model explicitly recommends using it.

Example: Recommended Negative Prompt for Animagine-XL 3.1
nsfw, lowres, (bad), text, error, fewer, extra, missing, worst quality, jpeg artifacts, low quality, watermark, unfinished, displeasing, oldest, early, chromatic aberration, signature, extra digits, artistic error, username, scan,
In such cases, the custom model is tuned with the expectation of Negative Prompt input, so using Negative Prompt results in higher-quality images.
The Direct Approach: Improving CLIP!
While CFG Scale and Negative Prompt can improve illustration quality when used well, they are quite challenging to fine-tune.
For improving prompt fidelity, the direct approach is enhancing CLIP.

By improving CLIP’s performance, Conditioning can be brought closer to the target direction, allowing higher prompt fidelity without relying on CFG Scale or Negative Prompt.
SDXL: CLIP-G, a Major Upgrade from CLIP-L
Improved CLIP-L: Matching CLIP-G’s Performance
Among the various methods tried to improve image quality, upgrading CLIP-L has been the most effective in my experience.
Update: December 31, 2024
I compared the effects of the improved CLIP-L with actual images.
Powerful Support with T5xxl
Another approach to improving prompt fidelity is the introduction of T5xxl.
Unlike CLIP, T5xxl has no image recognition capability, but it leverages advanced text comprehension to turn the input prompt into a richer encoding, which is passed to the diffusion model alongside CLIP's output.

Since June 2024, Stable Diffusion 3 and Flux.1 have incorporated T5xxl, significantly improving Conditioning accuracy, and it’s been announced that Negative Prompt is no longer necessary.
What Is AI’s Creativity?
Now, let’s explore AI’s creativity in the remaining part of the article.
What kind of images does AI envision after receiving our prompts?

When considering AI’s freest expression, Conditioning represents the image AI imagines from the prompt.
Outputting Conditioning Directly
Let’s try outputting Conditioning as is.
There are two ways to output Conditioning:
Set CFG Scale to 1
From the earlier graph, the generated image can be expressed as follows:

- Generated Image = Unconditioning + (Conditioning − Unconditioning) × CFG Scale
Substituting 1 into the formula gives Generated Image = Unconditioning + (Conditioning − Unconditioning) = Conditioning, so Unconditioning cancels out.
While some minor numerical errors may remain, setting CFG Scale to 1 reduces Unconditioning's influence to nearly zero.
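A quick numeric check of this cancellation, using placeholder values rather than real model outputs:

```python
# At CFG Scale 1 the formula collapses:
#   Generated = Unconditioning + (Conditioning - Unconditioning) * 1
#             = Conditioning

uncond = [1.0, 2.0]  # made-up baseline prediction
cond   = [3.0, 6.0]  # made-up prompt prediction

generated = [u + (c - u) * 1.0 for u, c in zip(uncond, cond)]
print(generated)          # → [3.0, 6.0]
assert generated == cond  # Unconditioning has dropped out
```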
Input the Same Prompt for Positive and Negative
Another method is to input the exact same prompt for both Positive and Negative.

In this case, Conditioning and Unconditioning are identical, completely eliminating Unconditioning’s influence.
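This can also be checked numerically: with identical predictions on both sides, the guidance term is zero no matter how high the CFG Scale goes (placeholder values again):

```python
# When the positive and negative prompts are identical, the baseline and
# Conditioning coincide, so (Conditioning - baseline) is zero and the
# CFG Scale has nothing to multiply.

cond = [3.0, 6.0]  # made-up prediction for the prompt
baseline = cond    # identical Negative Prompt -> identical prediction

for cfg_scale in (1.0, 7.0, 30.0):
    generated = [b + (c - b) * cfg_scale for b, c in zip(baseline, cond)]
    assert generated == cond  # Unconditioning's influence is gone

print("Unconditioning fully canceled at every scale")
```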
In this state, AI depicts the image directly inspired by the prompt, showcasing maximum creativity.
Does Flux.1 Lack Variety?
Flux.1 and SD 3.5 have greatly improved prompt fidelity with the introduction of T5xxl.
Launched in August 2024, Flux.1 boasts an impressively high success rate and exceptional quality in illustration generation.

However, while using Flux.1, I’ve noticed fewer opportunities to encounter surprisingly unique illustrations compared to before.
T5xxl’s strength in interpreting prompts might, in some cases, make it overly considerate.
I’m still experimenting with recreating SDXL’s variability in Flux.1.
Conclusion: Unleashing AI’s Creativity!
- Keep CFG Scale low.
- Use Negative Prompt minimally.
- Explore AI’s raw illustrations.
When I started with image generation AI, I thought it was a tool to create the illustrations I wanted.
But seeing AI’s incredible composition and expressiveness, I’ve come to believe that our role is to let AI generate freely and gently guide it toward the final product.
After much thought, I now set CFG Scale to around 1–2.
I’m grateful for the free availability of improved CLIP and high-quality models, and I look forward to continuing to enjoy image generation.
Thank you for reading to the end!