A detailed guide to Stable Diffusion's advanced features
Introduction
Stable Diffusion is a text-to-image model that turns written prompts into detailed images. It’s popular because it can create striking visuals for art, design, marketing, and entertainment. While its basic features go a long way, its advanced features give you far more control over your results. This guide explains those advanced features so you can use them in your projects.
See also: How to use stable diffusion to generate AI images
1. Core techniques for image generation
Whether you are a beginner or an advanced user of Stable Diffusion, you need to know how to control the way it interprets your prompt, the quality of the generated image, and the size of the image. All three can be controlled in the txt2img tab, under the “generation” sub-tab of the Stable Diffusion WebUI, by adjusting the Sampling Steps, CFG Scale, and Resolution values.
Sampling steps
Sampling steps refer to the number of iterations the model goes through while generating an image; think of them as the number of times the model refines its output. Imagine asking an artist to draw a rose in a short time frame without many revisions: the drawing will be less detailed. Give the same artist more time and room for corrections, and they will produce a better illustration. That’s how sampling steps work.
More sampling steps typically lead to better image quality, as the model has more opportunities to adjust and improve the details. However, increasing sampling steps also means longer wait times for results. A good starting point is around 20-30 steps, but you can experiment with this number based on your needs.
CFG scale
The CFG (Classifier-Free Guidance) scale determines how closely the image follows your text prompt. A higher scale means the model sticks more closely to your prompt, while a lower scale gives Stable Diffusion more creative freedom. If you want a very specific image, a CFG scale of 7-10 works well; for more artistic freedom, go lower.
Resolution
Resolution values determine the size and clarity of the image. Higher resolutions provide more detail but require more computing power and time. Start with smaller resolutions while exploring designs and then scale up as needed. For example, if you want a final image of 1080×1080, start with 512×512 to balance quality and performance.
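If you prefer scripting over the WebUI, the same three controls map directly onto parameters in the Hugging Face diffusers library. The sketch below is one minimal way to set them; the model ID and values are illustrative, not prescriptive.

```python
# Minimal sketch using the Hugging Face diffusers library; the model ID and
# parameter values are illustrative, not prescriptive.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    "a red rose in morning light",
    num_inference_steps=25,   # sampling steps: more steps, more refinement
    guidance_scale=7.5,       # CFG scale: higher sticks closer to the prompt
    width=512, height=512,    # resolution: start small, upscale later
).images[0]
image.save("rose.png")
```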
2. Advanced prompting strategies
When it comes to AI image generation, your prompts are your superpower. With effective prompting techniques, you can create impressive images even with less advanced AI models. Stable Diffusion takes prompting a step further by offering tools that enhance your prompting capabilities. Here are some key features to explore.
Prompt weighting
Prompt weighting is a powerful feature in Stable Diffusion that lets you emphasize specific parts of your input over others when generating images. By assigning varying levels of importance (weights) to words or phrases in your prompt, you can guide the model to focus on certain details, styles, or elements more effectively.
How prompt weighting works
In most Stable Diffusion interfaces, such as Automatic1111, Fooocus, or ComfyUI, prompt weighting follows a similar syntax:
- Increase weight: Enclose the word or phrase you want to emphasize in parentheses (). The more parentheses you add, the stronger the emphasis.
  - Example: "golden retriever wearing yellow (sunglasses)" gives more focus to “sunglasses.”
- Decrease weight: Enclose the word or phrase in square brackets []. This reduces its importance.
  - Example: "golden retriever wearing [yellow] sunglasses" reduces the importance of “yellow.”
Advanced weighting with values
You can also assign explicit weight values using the syntax (keyword:weight).
- Increase weight: Use a value greater than 1 to emphasize a word or phrase.
- Decrease weight: Use a value less than 1 to reduce emphasis.
For instance:
- If you want to emphasize “sunglasses” in the prompt “golden retriever wearing yellow sunglasses”, you could write:
"golden retriever wearing yellow (sunglasses:1.5)"
- To downplay the color of the sunglasses in the same prompt, you might write:
"golden retriever wearing (yellow:0.8) sunglasses"
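Because weights are plain prompt syntax, they also work when you drive the WebUI programmatically. Here is a hedged sketch against the Automatic1111 API; it assumes you launched the WebUI with the --api flag, and the URL is its default local address:

```python
# Sketch: sending a weighted prompt to a locally running Automatic1111 WebUI.
# Assumes the WebUI was started with the --api flag on the default port.
import base64
import requests

payload = {
    "prompt": "golden retriever wearing (yellow:0.8) (sunglasses:1.5)",
    "steps": 25,
    "cfg_scale": 7,
}
resp = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload)
resp.raise_for_status()

# The API returns generated images as base64-encoded strings.
with open("weighted.png", "wb") as f:
    f.write(base64.b64decode(resp.json()["images"][0]))
```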
Why use prompt weighting?
Prompt weighting gives you fine-tuned control over the output. It’s especially useful for:
- Highlighting specific features or objects.
- Achieving a desired style or vibe in your image.
- Experimenting with creative ideas in a more precise way.
With these tools, you can craft prompts that make your images stand out, even in highly detailed or stylistic scenarios.
Negative prompts
Negative prompts help specify what you don’t want in your final output. If you’re generating an image but want to avoid certain styles or elements (like “no dark colors”), including negative prompts can further refine your results. When using negative prompts, you can also apply weighting to emphasize the elements you don’t want to appear.
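Negative prompts travel the same way. A short sketch, again assuming a locally running WebUI with the API enabled:

```python
# Sketch: a weighted negative prompt via the same WebUI API endpoint as the
# previous example (assumes the WebUI is running locally with --api).
import requests

payload = {
    "prompt": "portrait of a golden retriever, studio lighting",
    "negative_prompt": "(dark colors:1.3), blurry, low quality",
    "steps": 25,
    "cfg_scale": 7,
}
resp = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload)
resp.raise_for_status()
```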
3. ControlNet integration
ControlNet is a tool in Stable Diffusion that gives you more control over how images are created. It lets you add extra details, like sketches or photos, along with your text prompts, to guide the process and make the results match your vision more closely.
What is the purpose of ControlNet?
Normally, Stable Diffusion creates an image based only on the text you type. With ControlNet, you can also upload a picture or a simple drawing. This extra input acts like a guide to help the AI understand what you want, making the final image look closer to your desired style or theme.
How to use ControlNet
Here’s how to get started with ControlNet in the Stable Diffusion WebUI:
- Install the ControlNet extension.
- Find the ControlNet section: Go to the txt2img tab in the WebUI.
- Add extra input (optional): Upload a drawing, photo, or other details (like outlines or poses) to steer the image creation.
- Create your image: Generate your image with these extra details included.
ControlNet makes it easier to create images that match what you have in mind by giving the AI clearer instructions. It’s a simple way to get results that feel more personal and detailed.
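If you work in code, ControlNet is also available through the diffusers library. Below is a sketch using a Canny edge map as the extra input; the model IDs are commonly published ones, but treat them and the file name as assumptions:

```python
# Sketch: ControlNet in diffusers, guiding generation with a Canny edge map.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

# The control image (here, a pre-computed edge map) steers the composition,
# while the text prompt controls style and content.
edges = load_image("pose_edges.png")
image = pipe("a knight in ornate armor", image=edges).images[0]
```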
4. Image modification techniques
Stable Diffusion is best known for creating images from scratch, but it’s also a powerful tool for editing and enhancing existing images. You can use it to make changes, add elements, or even create entirely new versions of an image. Here are some key techniques for working with existing images:
Basic image modification
To edit an image or apply a new style, the img2img feature is your go-to tool. With img2img, you can:
- Apply a fresh style.
- Add new elements.
- Alter specific features while keeping the overall structure intact.
To modify an image, upload it in the img2img tab of the Stable Diffusion WebUI. Enter a prompt that describes the changes you want, and generate a new version of the image with those edits.
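In code, the equivalent of the img2img tab is an image-to-image pipeline. A minimal diffusers sketch follows; the strength value, like the WebUI's denoising strength, controls how far the result may drift from the original:

```python
# Sketch: img2img with diffusers. `strength` controls how much the result may
# depart from the input image (closer to 0 = faithful, closer to 1 = freer).
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from diffusers.utils import load_image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

init = load_image("photo.png").resize((512, 512))
image = pipe(
    "the same scene as a watercolor painting",
    image=init,
    strength=0.6,
).images[0]
```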
Upscaling and enhancing images
If you have an image that needs higher resolution or better detail, Stable Diffusion can upscale it for you. This is useful for making images suitable for printing or high-quality displays.
Here’s how to upscale an image:
- Open the Extras tab in the WebUI.
- Upload your image.
- Adjust the “Scale By” value to set the new size.
- Choose an upscaler, if needed.
- Generate your image.
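The Extras tab is also exposed over the WebUI's API. Here is a sketch of the same steps in Python; the upscaler name must match one installed in your WebUI, so treat it as an assumption:

```python
# Sketch: the WebUI "Extras" upscaler via the Automatic1111 API.
# Assumes the WebUI is running locally with the --api flag.
import base64
import requests

with open("small.png", "rb") as f:
    b64 = base64.b64encode(f.read()).decode()

payload = {
    "image": b64,
    "upscaling_resize": 2,         # the "Scale By" value
    "upscaler_1": "R-ESRGAN 4x+",  # assumed to be installed in your WebUI
}
resp = requests.post(
    "http://127.0.0.1:7860/sdapi/v1/extra-single-image", json=payload
)
with open("large.png", "wb") as f:
    f.write(base64.b64decode(resp.json()["image"]))
```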
Turning drawings into digital images
One of the most exciting features in Stable Diffusion is transforming simple sketches into stunning digital artworks. To do this, you’ll need to install the ControlNet extension mentioned earlier in this guide.
Here’s how to turn your drawing into an image:
- Go to the img2img tab and enter your prompt.
- Open the ControlNet section and enable it by checking the box.
- Upload your sketch by clicking the upload area or dragging the file in.
- Select the Scribble button and choose a scribble model.
- Click Generate to create your image.
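The same scribble workflow can be approximated outside the WebUI with the diffusers scribble model. A sketch, with file names as placeholders:

```python
# Sketch: the ControlNet pipeline from section 3, but with the scribble model
# so a rough drawing can guide the output. File names are placeholders.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-scribble", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

sketch = load_image("my_drawing.png")
image = pipe("a cozy cottage in a forest, digital painting", image=sketch).images[0]
```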
These tools make Stable Diffusion a versatile option for modifying images and bringing creative ideas to life, from simple edits to complex transformations.
5. Image inpainting and outpainting
Inpainting
Inpainting is a way to fill in missing parts of an image or remove unwanted areas seamlessly. For example, if an unwanted object distracts from an otherwise good photo, inpainting can erase it and fill in the background naturally.
To use inpainting in Stable Diffusion:
- Go to the img2img tab, select the inpaint option, and upload your image.
- Create a mask by drawing over the area you want to change.
- Write a prompt describing what should replace that area.
- Generate the new image.
For the best results, use an inpainting model to make edits look more natural.
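For reference, here is how the same flow looks in diffusers with a dedicated inpainting model; file names are placeholders:

```python
# Sketch: inpainting with a dedicated inpainting model. White pixels in the
# mask are regenerated; black pixels are kept.
import torch
from diffusers import StableDiffusionInpaintPipeline
from diffusers.utils import load_image

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

image = load_image("beach.png").resize((512, 512))
mask = load_image("beach_mask.png").resize((512, 512))
result = pipe(
    "empty sand and gentle waves",
    image=image,
    mask_image=mask,
).images[0]
```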
Outpainting
Sometimes you generate an image with a subject you like but wish you could extend the scene beyond its edges. Outpainting allows you to expand a generated image by adding content outside its original borders. It’s great for creating panoramic views or adding more context to a scene.
To use outpainting in Stable Diffusion:
- Go to the PNG Info tab and drag in a previously created image.
- Click “Send to img2img” and select “Inpainting” to copy the image.
- Mark the edges of the image where you want to expand it.
- Adjust settings like resize mode, denoising strength, and ControlNet for fine-tuning.
- Generate the extended image.
Using an inpainting model improves results, ensuring the new content blends smoothly with the original.
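Under the hood, outpainting is essentially inpainting on a padded canvas, which you can sketch directly in code. The sizes and prompt below are illustrative, and the pipeline is the inpainting one from the previous sketch:

```python
# Sketch: outpainting as "inpainting on a padded canvas". The original image
# is pasted onto a larger canvas, and the mask marks only the new border area
# for generation. Sizes and prompt are illustrative.
from PIL import Image

src = Image.open("scene.png")  # e.g. 512x512
w, h = src.size
pad = 128  # extend 128 px on the left and right

canvas = Image.new("RGB", (w + 2 * pad, h), "gray")
canvas.paste(src, (pad, 0))

mask = Image.new("L", canvas.size, 255)          # white = regenerate
mask.paste(Image.new("L", (w, h), 0), (pad, 0))  # black = keep original

# Reuse the inpainting pipeline `pipe` from the previous sketch.
result = pipe(
    "wide panoramic view of the same scene",
    image=canvas,
    mask_image=mask,
    height=h,
    width=w + 2 * pad,
).images[0]
```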
6. Leveraging LoRAs for enhanced generation
LoRAs (Low-Rank Adaptations) are small add-on models that steer generation toward a particular style, subject, or quality improvement without needing much extra computational power. Here’s how to use them in your workflow:
- Pick a LoRA: Look for one that matches what you want to achieve (like making landscapes look better) and download it.
- Install the LoRA: Add it to the Stable Diffusion WebUI.
- Activate the LoRA: Go to the “Text to Image” tab, select your LoRA from the LoRA subtab, and it’ll be included in your prompt.
- Read the documentation: Check the LoRA’s documentation to understand how to apply it.
- Experiment: Try different settings until your image looks just right.
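In diffusers, the same workflow is a couple of calls. The file path below is a placeholder for whichever LoRA you downloaded; in the WebUI itself, selecting a LoRA inserts a tag like <lora:name:0.8> into your prompt, where the number plays the same role as the scale in this sketch:

```python
# Sketch: loading a LoRA in diffusers. The file path is a placeholder; use
# the LoRA you actually downloaded.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.load_lora_weights("path/to/my_landscape_lora.safetensors")  # placeholder

image = pipe(
    "misty mountain valley at dawn",
    cross_attention_kwargs={"scale": 0.8},  # LoRA strength, like <lora:name:0.8>
).images[0]
```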
7. Checkpoint merger
If you enjoy experimenting and creating unique ideas, the checkpoint merger feature might become your favorite tool.
Checkpoint merger, available in Automatic1111, lets you combine different Stable Diffusion models. By merging models, you can create a new one that blends the features and styles of multiple models, and you can control how much influence each model has on the final result.
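The most common merge mode, weighted sum, is simple enough to sketch directly: each tensor in the new checkpoint is an interpolation of the corresponding tensors in the two source models. File names below are placeholders:

```python
# Sketch: a weighted-sum merge of two checkpoints, the simplest of the merge
# modes the WebUI offers. File names are placeholders.
from safetensors.torch import load_file, save_file

alpha = 0.3  # influence of model B (the WebUI's multiplier slider)
a = load_file("model_a.safetensors")
b = load_file("model_b.safetensors")

merged = {}
for key, tensor_a in a.items():
    if key in b and b[key].shape == tensor_a.shape:
        merged[key] = (1 - alpha) * tensor_a + alpha * b[key]
    else:
        merged[key] = tensor_a  # keep A's weights where B has no match

save_file(merged, "merged_model.safetensors")
```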
This tool is perfect for crafting unique visual styles by mixing and matching artistic elements. The possibilities are as limitless as your creativity!
Final notes
Learning Stable Diffusion’s features from basic to advanced allows you to have more control over your images, making it easier to achieve your desired results. Take the time to master each feature, and happy creating with Stable Diffusion!