
Guide: Stable Diffusion’s CFG Scale Explained

In Stable Diffusion, CFG stands for Classifier-Free Guidance. The CFG scale is the setting that controls how closely Stable Diffusion follows your text prompt, and it applies to both text-to-image (txt2img) and image-to-image (img2img) generation.

In theory, the higher the CFG value, the more strictly Stable Diffusion follows your prompt. The default value is 7, which gives a good balance between creative freedom and adherence to your direction. A value of 1 gives Stable Diffusion almost complete freedom, whereas values above 15 are quite restrictive.

The Stable Diffusion Web UI restricts CFG to positive numbers, with a minimum of 1 and a maximum of 30. However, if you run Stable Diffusion from a terminal, you can set CFG as high as 999 or even make it negative. A negative CFG tells Stable Diffusion to generate the opposite of your text prompt. This is not common practice, though, since using negative text prompts gives much more predictable results that are more likely to represent what you want.
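If you are running Stable Diffusion from code rather than the Web UI, the CFG value is normally exposed as a parameter. As a rough sketch, with the Hugging Face diffusers library it is the guidance_scale argument of the pipeline call; the model ID, prompt, and values below are just placeholders:

    import torch
    from diffusers import StableDiffusionPipeline

    # Load a Stable Diffusion checkpoint (placeholder model ID)
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    image = pipe(
        prompt="a watercolor painting of a lighthouse at sunset",
        negative_prompt="blurry, oversaturated",  # usually better than a negative CFG
        guidance_scale=7.0,                       # the CFG scale
        num_inference_steps=20,
    ).images[0]
    image.save("lighthouse_cfg7.png")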

The CFG Scale setting in the default txt2img tab of the Stable Diffusion Web UI. CFG is also used in img2img.

How CFG Affects the Quality of Output Images

Using CFG to control how closely Stable Diffusion follows your text prompt sounds straightforward enough, but sadly Stable Diffusion is not quite that simple. There are tradeoffs that come with different CFG values. To demonstrate them, here is a specific example using the Euler A sampling method and 20 sampling steps, which are the default settings in the Stable Diffusion Web UI.

Effects of increasing CFG on output image quality in Stable Diffusion

From this example you can notice a few things:

  • Color saturation increases as CFG increases
  • Contrast increases as CFG increases
  • Above a certain CFG value, output images become blurrier, resulting in loss of detail
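
If you want to run a comparison like this yourself, a simple approach is to fix the seed, sampler, and step count and sweep only the CFG value. Here is a minimal sketch using the diffusers library with Euler Ancestral ("Euler A") and 20 steps to mirror the Web UI defaults; the model ID, prompt, and CFG values are placeholders:

    import torch
    from diffusers import StableDiffusionPipeline, EulerAncestralDiscreteScheduler

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")
    # Use Euler Ancestral ("Euler A") to match the Web UI's default sampler
    pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)

    prompt = "a photo of a red fox in a snowy forest"
    generator = torch.Generator("cuda")

    for cfg in [1, 4, 7, 10, 15, 20, 30]:
        generator.manual_seed(42)  # same starting noise so only CFG changes
        image = pipe(
            prompt,
            guidance_scale=cfg,
            num_inference_steps=20,
            generator=generator,
        ).images[0]
        image.save(f"fox_cfg_{cfg}.png")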

To counteract the decrease in output image quality at higher CFG values, you can generally do two things:

  • Increase sampler steps: the general rule of thumb is that more sampler steps will result in more detail in the output image, although like CFG, that rule applies only up to a certain threshold. Keep in mind that more sampler steps will generally result in longer processing times.
  • Change sampler methods: Some samplers were developed to run optimally at lower or higher CFG values and sampling steps. For example, UniPC can return good results with CFG as low as 3, but quality often starts to degrade around a CFG of 10. On the other hand, DPM++ SDE Karras generally produces lots of image detail at CFG values greater than 7.

To get the best output images while minimizing memory use and processing time, you need to find a balance between CFG, sampling steps, and the sampler for the system you're using.
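
In the diffusers library, swapping samplers means swapping the pipeline's scheduler, so this balance can be explored directly in code. The sketch below pairs UniPC with a low CFG and few steps, and a DPM++ scheduler with Karras sigmas with a higher CFG and more steps; the exact pairings simply mirror the rough guidance above and are not hard rules:

    import torch
    from diffusers import (
        StableDiffusionPipeline,
        UniPCMultistepScheduler,
        DPMSolverMultistepScheduler,
    )

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")
    prompt = "an isometric pixel-art castle on a cliff"

    # UniPC can hold up well at low CFG and low step counts
    pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
    low = pipe(prompt, guidance_scale=3, num_inference_steps=15).images[0]
    low.save("castle_unipc_cfg3.png")

    # A DPM++ scheduler with Karras sigmas for higher CFG and more steps
    pipe.scheduler = DPMSolverMultistepScheduler.from_config(
        pipe.scheduler.config, use_karras_sigmas=True
    )
    high = pipe(prompt, guidance_scale=9, num_inference_steps=30).images[0]
    high.save("castle_dpmpp_cfg9.png")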

Start Playing With CFG in Stable Diffusion

If you want to use Stable Diffusion and see how adjusting CFG can improve your AI-generated images, the following guides will help you get started quickly, including how to use it without downloading and installing it yourself:

  • What are Seeds and How to Use Them
  • How To Set Up ControlNet Models in Stable Diffusion
  • Prompts for Color and Image Adjustment in Stable Diffusion
  • How to Get All the Features of Stable Diffusion Without Installing It Yourself
  • How to Train a Custom Embedding in Stable Diffusion Tutorial
  • Stable Diffusion Tutorial: How to Inpaint