Inference with SSD 1B — A distilled Stable Diffusion XL model that is 50% smaller and 60% faster
HuggingFace Inference with Segmind's Stable Diffusion 1B
Segmind has open-sourced its latest model, SSD-1B. A distilled diffusion-based text-to-image model, SSD-1B promises fast image generation with a strong combination of speed, efficiency, and quality.
SSD-1B is a distilled version of Stable Diffusion XL 1.0: it is 50% smaller and up to 60% faster in both inference and fine-tuning.
Since SSD-1B is compatible with SDXL 1.0, it can be used with the HuggingFace Diffusers library just like SDXL 1.0. All the fine-tuning and training code is also open-sourced in Segmind's SSD-1B repository.
Let's look at a few images generated with SSD-1B before moving on to inference with the HuggingFace Diffusers library!
Inference Code
Let's see how we can generate images using the HuggingFace Diffusers library!
The Colab notebook for this tutorial can be found here.
Let's install the required libraries and load the autotime extension:
!pip install --quiet git+https://github.com/huggingface/diffusers.git@d420d71398d9c5a8d9a5f95ba2bdb6fe3d8ae31f
!pip install --quiet ipython-autotime
!pip install --quiet transformers==4.34.1 accelerate==0.24.0 safetensors==0.4.0
!pip install --quiet ipyplot
%load_ext autotime
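One optional sanity check before loading anything: the fp16 pipeline below is moved to a CUDA device, so it is worth confirming that your Colab runtime actually has a GPU. A small sketch (not part of the original notebook):
import torch

# The pipeline below assumes a CUDA-capable GPU runtime.
assert torch.cuda.is_available(), "Switch the Colab runtime to a GPU (Runtime > Change runtime type)."
print(torch.cuda.get_device_name(0))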
Let's write the inference code to generate a single image and save it:
from diffusers import StableDiffusionXLPipeline
import torch
import ipyplot

# Load the SSD-1B pipeline in half precision and move it to the GPU
pipe = StableDiffusionXLPipeline.from_pretrained("segmind/SSD-1B", torch_dtype=torch.float16, use_safetensors=True, variant="fp16")
pipe.to("cuda")

prompt = "Cinematic, breathtaking, sleek, a white elegant cat amidst a dense foggy London alley, poised beneath a lamp post, Striking image, 8K, Desktop background, Immensely sharp."
neg_prompt = "ugly, tiling, poorly drawn hands, poorly drawn feet, poorly drawn face, out of frame, extra limbs, disfigured, deformed, body out of frame, blurry, bad anatomy, blurred, watermark, grainy, signature, cut off, draft"

# Generate a single image, save it, and display it
image = pipe(prompt=prompt, negative_prompt=neg_prompt).images[0]
image.save("test.jpg")
ipyplot.plot_images([image], img_width=400)
The above code generates the following image at 1024×1024 resolution:
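SSD-1B inherits SDXL's 1024×1024 default output size. If you want a different resolution, the pipeline also accepts explicit height and width arguments. A minimal sketch (the 768×768 value is just an example, and quality may drop at sizes far from the training resolution):
# Reuses pipe, prompt, and neg_prompt defined above.
# Dimensions should be multiples of 8.
image = pipe(prompt=prompt, negative_prompt=neg_prompt, height=768, width=768).images[0]
image.save("test_768.jpg")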
Now let's improve the code to add a random seed and generate multiple images at once:
from diffusers import StableDiffusionXLPipeline
import torch
import ipyplot

# Seed the generator for reproducible outputs
generator = torch.Generator("cuda").manual_seed(1024)

pipe = StableDiffusionXLPipeline.from_pretrained("segmind/SSD-1B", torch_dtype=torch.float16, use_safetensors=True, variant="fp16")
pipe.to("cuda")

prompt = "Cinematic, breathtaking, sleek, a white elegant cat amidst a dense foggy London alley, poised beneath a lamp post, Striking image, 8K, Desktop background, Immensely sharp."
neg_prompt = "ugly, tiling, poorly drawn hands, poorly drawn feet, poorly drawn face, out of frame, extra limbs, disfigured, deformed, body out of frame, blurry, bad anatomy, blurred, watermark, grainy, signature, cut off, draft"

# Generate two images in one call, passing the seeded generator
images = pipe(prompt=prompt, negative_prompt=neg_prompt, guidance_scale=7.5, num_inference_steps=30, num_images_per_prompt=2, generator=generator).images

# Save each image, then display them side by side
for idx, image in enumerate(images, 1):
    image.save(f"test{idx}.jpg")

ipyplot.plot_images(images, img_width=400)
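The guidance_scale and num_inference_steps arguments are worth experimenting with: guidance_scale controls how strongly generation follows the prompt, and num_inference_steps trades speed for detail. As a quick sketch (the values 5.0 and 9.0 are arbitrary choices of mine), you can sweep guidance_scale on a fixed seed and compare the outputs:
# Reuses pipe, prompt, and neg_prompt from above.
for gs in (5.0, 9.0):
    # Re-seed each run so only guidance_scale differs between images.
    gen = torch.Generator("cuda").manual_seed(1024)
    img = pipe(prompt=prompt, negative_prompt=neg_prompt, guidance_scale=gs, num_inference_steps=30, generator=gen).images[0]
    img.save(f"guidance_{gs}.jpg")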
If you encounter CUDA out-of-memory errors, reduce num_images_per_prompt and rerun after restarting the Colab runtime.
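Before resorting to a restart, Diffusers also ships a couple of optional memory-saving switches (a sketch of standard Diffusers options, not something this tutorial relies on; both trade some speed for memory):
# Compute attention in slices instead of one large pass.
pipe.enable_attention_slicing()

# More aggressive: keep submodules on the CPU and move each to the GPU only when
# needed (requires accelerate; call this instead of pipe.to("cuda")).
pipe.enable_model_cpu_offload()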
If everything runs successfully, you should see two images:
In the next tutorial, we will see how to do DreamBooth LoRA training on our own custom objects using Segmind's SSD-1B!
Happy AI exploration! If you enjoyed this post, feel free to follow me on Twitter for daily AI content!