5 ControlNet resources to try or deploy on your own
ControlNet gives more control over Stable Diffusion image generation
Stable Diffusion lets anyone generate images from text prompts, but on its own it offers no visual guidance. ControlNet solves that problem: you provide a source image along with a text prompt, and the image acts as inspiration (or control) for Stable Diffusion's image generation.
In short, you give an input image to ControlNet, and it extracts an intermediate representation of that image, such as a segmentation map, depth map, pose estimation, or edge detection map.

This intermediate representation, along with the text prompt, is passed to the Stable Diffusion model to generate a new image. You can even combine multiple intermediate representations (Multi-ControlNet) and pass them together as input to Stable Diffusion.
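To make the flow concrete, here is a minimal sketch using the Hugging Face diffusers library (one common way to run ControlNet, not the only one). It extracts a Canny edge map from a source image and feeds it, together with a text prompt, into a ControlNet-conditioned Stable Diffusion pipeline. The image path, prompt, and model checkpoints are illustrative assumptions.

```python
import cv2
import numpy as np
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

# Hypothetical source image; replace with your own photo.
source = load_image("source.png")

# Step 1: extract an intermediate representation (here, a Canny edge map).
edges = cv2.Canny(np.array(source), 100, 200)
edges = np.stack([edges] * 3, axis=-1)  # expand to 3 channels for the pipeline
control_image = Image.fromarray(edges)

# Step 2: load a ControlNet trained on edge maps and plug it into Stable Diffusion.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

# Step 3: the edge map plus the text prompt guide the generation.
result = pipe(
    "a futuristic city at sunset, highly detailed",  # example prompt
    image=control_image,
    num_inference_steps=20,
).images[0]
result.save("output.png")
```

For Multi-ControlNet, the same pipeline accepts a list of ControlNet models and a matching list of control images, so several intermediate representations can steer one generation at once.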
For example, you can extract a depth map from a photograph and combine it with a text prompt like “An American woman with blonde hair, smiling, a world map outline on the classroom board, a well-lit colorful photograph” to generate a new image that keeps the same pose and composition.
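A depth-guided variant of the same idea might look like the sketch below: a depth estimator produces the depth map, and a depth-conditioned ControlNet uses it alongside the prompt from above. The source image path and the depth model checkpoint are assumptions for illustration.

```python
import numpy as np
import torch
from PIL import Image
from transformers import pipeline as hf_pipeline
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

source = load_image("classroom_photo.png")  # hypothetical source photo

# Estimate a depth map from the source image.
depth_estimator = hf_pipeline("depth-estimation", model="Intel/dpt-large")
depth = np.array(depth_estimator(source)["depth"])
depth = np.stack([depth] * 3, axis=-1)  # 3-channel image for the pipeline
depth_map = Image.fromarray(depth)

# Depth-conditioned ControlNet plugged into Stable Diffusion.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-depth", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

prompt = (
    "An American woman with blonde hair, smiling, a world map outline "
    "on the classroom board, a well-lit colorful photograph"
)
image = pipe(prompt, image=depth_map, num_inference_steps=20).images[0]
image.save("depth_controlled_output.png")
```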