Generate Kids’ Story Illustrations with AI in Just 1 Click Programmatically: Full Tutorial

Tutorial using OpenAI’s ChatGPT API and SDXL API from Segmind.com

Ramsri Goutham
8 min read · Oct 12, 2023
Children’s story illustrations with AI

Introduction

Illustrations play a vital role in children’s stories, captivating young readers and bringing the narrative to life. However, creating these illustrations manually can be a time-consuming and resource-intensive process, especially for authors and storytellers who may not have artistic skills. This is where the power of generative AI comes into play.

Automatic illustration generation for children’s stories is not only useful but also transformative, as it allows storytellers to effortlessly complement their tales with vibrant and tailored visuals, enhancing the overall reading experience and enabling them to share their creativity with the world in a more engaging manner.

In this blog post, we’ll see how to use AI to automatically generate illustrations for children’s stories with just a click of a button. This allows us to create custom images tailored to the story instead of using random stock photos. We’ll use OpenAI’s ChatGPT API and Segmind’s Stable Diffusion XL (SDXL) to extract illustration prompts from text and generate images. Let’s get started!

We will create a simple UI with Gradio where you can paste in any text, pick a style such as watercolor or line art, and get illustrations back.

Gradio App UI

Step 1: The Theory

We will first take the text, use OpenAI’s ChatGPT API, and extract a list of illustration prompts.

Extract Illustration prompts from any text input using OpenAI

Once we have extracted a few illustration prompts from the input text or story, we use Segmind’s API, which takes each illustration description (prompt) and generates a relevant image for it.

Segmind uses SDXL (Stable Diffusion XL 1.0) to generate high-quality illustrations in the style that we give.

Images generated using Segmind’s API from illustration descriptions
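
The two-stage flow described above can be sketched as a small function that receives the two services as plain callables. This is only an illustrative outline: `illustrate_story`, `extract_briefs`, `render_image`, and the two stand-in functions are hypothetical names, not part of the OpenAI or Segmind APIs. The real calls are defined in Step 3.

```python
from typing import Any, Callable, List


def illustrate_story(
    text: str,
    style: str,
    extract_briefs: Callable[[str], List[str]],
    render_image: Callable[[str, str], Any],
) -> List[Any]:
    """Stage 1: text -> illustration briefs. Stage 2: each brief -> one image."""
    briefs = extract_briefs(text)
    return [render_image(brief, style) for brief in briefs]


# Stand-in callables (assumptions for illustration only) so the flow runs offline.
def fake_extract(text: str) -> List[str]:
    # Pretend each sentence is one illustration brief.
    return [s.strip() for s in text.split(".") if s.strip()]


def fake_render(brief: str, style: str) -> str:
    # Return a description instead of a real image.
    return f"<{style} image: {brief}>"


images = illustrate_story(
    "A boy watches sheep. A wolf appears.", "watercolor", fake_extract, fake_render
)
print(images)
```

Passing the services in as arguments keeps the flow easy to try out without spending API credits.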

Step 2: Signup for services and get the API Keys

To get started, you’ll need to sign up for two important services: Segmind and OpenAI. We are going to use OpenAI’s ChatGPT API to extract relevant illustration descriptions from text and use Segmind’s text-to-image API to generate high-quality illustrations from illustration descriptions.

Get OpenAI’s API Key

Go to platform.openai.com
Sign up or log in.
Click on the top right to go to “View API Keys”.
Create a new secret key if you don’t have one already and save it securely.
If necessary, add a payment card so that you don’t run out of credits.

Get Segmind’s API Key

Go to Segmind.com and Log in/Sign up
Click on the top right to go to the console.
Once in the console, click on the “API keys” tab and “Create New API Key”.
You get a few free credits daily, but for uninterrupted usage you can go to “billing” and “add credits” by paying with your card.

If you want to know how much each Segmind API call costs, you can check the pricing tab of the corresponding model. An example of SDXL pricing is shown here.

Step 3: The Code

The Google Colab notebook containing the full code can be found here.

Install the necessary Python libraries and enter your API keys from the above step when prompted!

!pip install --quiet segmind==0.2.3
!pip install --quiet gradio==3.43.2

from getpass import getpass
openaikey = getpass('Enter the openai API key: ')
segmindkey = getpass('Enter the segmind API key: ')
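
If you rerun the notebook often, retyping the keys gets tedious. One possible alternative, sketched below, reads each key from an environment variable and only falls back to a prompt when the variable is unset. The helper name `load_key` and the variable names `OPENAI_API_KEY` and `SEGMIND_API_KEY` are assumptions made for this sketch, not something either service requires.

```python
import os
from getpass import getpass


def load_key(env_name: str, prompt_text: str, ask=getpass) -> str:
    """Return the key from the environment if set, otherwise prompt for it."""
    value = os.environ.get(env_name, "").strip()
    if value:
        return value
    # `ask` defaults to getpass so the key is not echoed to the screen.
    return ask(prompt_text).strip()


# Example usage (hypothetical variable names):
# openaikey = load_key("OPENAI_API_KEY", "Enter the openai API key: ")
# segmindkey = load_key("SEGMIND_API_KEY", "Enter the segmind API key: ")
```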

Now let’s define two main functions fetch_illustrator_prompts and generate_prompt.

import requests
import json
from pprint import pprint

chatgpt_url = "https://api.openai.com/v1/chat/completions"
chatgpt_headers = {
    "content-type": "application/json",
    "Authorization": "Bearer {}".format(openaikey),
}

def fetch_illustrator_prompts(prompt, url, headers, max_retries=3):
    retries = 0

    while retries < max_retries:
        # Define the payload for the chat model
        messages = [
            {"role": "system", "content": "You are an expert author who can read children's stories and create short briefs for an illustrator, providing specific instructions, ideas, or guidelines for the illustrations you want them to create."},
            {"role": "user", "content": prompt}
        ]

        chatgpt_payload = {
            "model": "gpt-3.5-turbo-16k",
            "messages": messages,
            "temperature": 1.3,
            "max_tokens": 2000,
            "top_p": 1,
            "stop": ["###"]
        }

        # Make the request to OpenAI's API
        response = requests.post(url, json=chatgpt_payload, headers=headers)
        response_json = response.json()

        try:
            # Extract data from the API's response
            output = json.loads(response_json['choices'][0]['message']['content'].strip())
            # pprint(output)
            return output
        except (KeyError, IndexError, json.JSONDecodeError) as e:
            # KeyError/IndexError cover error payloads that carry no 'choices' entry;
            # JSONDecodeError covers model output that is not valid JSON.
            print(f"Could not parse response: {e!r}")
            retries += 1
            print(f"Retry attempt {retries} out of {max_retries}")

    # If max retries are reached without a successful response, return None or handle it as needed.
    return None

def generate_prompt(text, count):
    prompt_prefix = """{}
------------------------------------
Generate {} short briefs as a list from the above story to give as input to an illustrator to generate relevant children's story illustrations.
Strictly add no common prefix to briefs. Strictly generate each brief as a single sentence that contains all the necessary information.
Strictly output your response in a JSON list format, adhering to the following sample structure:""".format(json.dumps(text), json.dumps(count))

    sample_output = {"illustrations": ["...", "...", "..."]}

    prompt_postinstruction = "\nOutput:"

    return prompt_prefix + json.dumps(sample_output) + prompt_postinstruction

The generate_prompt function takes in any text along with the count and creates a prompt to be passed into the fetch_illustrator_prompts function which returns the required number of illustration descriptions as a JSON list.
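
Because the model is only asked, not forced, to return JSON in this shape, it can help to check the parsed result before using it. The following is a minimal sketch of such a check. `parse_briefs` is a hypothetical helper that is not part of the original notebook, and it assumes the `{"illustrations": [...]}` structure requested in the prompt.

```python
import json
from typing import List


def parse_briefs(raw: str, count: int) -> List[str]:
    """Parse the model's raw text and return exactly `count` non-empty briefs."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    briefs = data.get("illustrations") if isinstance(data, dict) else None
    if not isinstance(briefs, list) or not all(isinstance(b, str) for b in briefs):
        raise ValueError("expected a JSON object with an 'illustrations' list of strings")
    cleaned = [b.strip() for b in briefs if b.strip()]
    if len(cleaned) < count:
        raise ValueError(f"expected {count} briefs, got {len(cleaned)}")
    return cleaned[:count]


sample = '{"illustrations": ["A bored boy on a hill", "Villagers running", "A boy weeping"]}'
print(parse_briefs(sample, 3))
```

Raising `ValueError` for every malformed case means a caller can treat invalid JSON and a wrong shape the same way, for instance by retrying the request.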

For example, you can take the text of any story, such as “The Boy Who Cried Wolf”, and call the generate_prompt and fetch_illustrator_prompts functions as follows:

text = """Once, there was a boy who became bored when he watched over the village sheep grazing on the hillside. To entertain himself, he sang out, “Wolf! Wolf! The wolf is chasing the sheep!”
When the villagers heard the cry, they ran up the hill to drive the wolf away. But when they arrived, they saw no wolf. The boy was amused when he saw their angry faces.
“Don’t scream wolf when there is no wolf, boy!” the villagers warned. They angrily went back down the hill.
Later, the shepherd boy cried out once again, “Wolf! Wolf! The wolf is chasing the sheep!” To his amusement, the villagers came running up the hill to scare the wolf away.
As they saw there was no wolf, they said strictly, “Save your frightened cry for when there really is a wolf! Don’t cry ‘wolf’ when there is no wolf!” But the boy grinned at their words while they walked, grumbling down the hill once more.
Later, the boy saw a real wolf sneaking around his flock. Alarmed, he jumped on his feet and cried out as loud as he could, “Wolf! Wolf!” But the villagers thought he was fooling them again, and they didn’t come to help.
At sunset, the villagers went looking for the boy who hadn’t returned with their sheep. When they went up the hill, they found him weeping.
“There was a wolf here! The flock is gone! I cried out, ‘Wolf!’ but you didn’t come,” he wailed."""

count = 3
prompt = generate_prompt(text, count)

illustrator_prompts = fetch_illustrator_prompts(prompt, chatgpt_url, chatgpt_headers)
print(illustrator_prompts)
for each in illustrator_prompts['illustrations']:
    print(each)

Here, illustrator_prompts['illustrations'] will be a list of 3 descriptions generated from the story, like this:

1. “Illustrate the shepherd boy standing on a hillside looking bored as he watches the village sheep grazing”

2. “show the villagers running up the hill with angry faces, believing the boys cries about a wolf chasing the sheep”

3. “create an illustration of the shepherd boy weeping on the hillside as the villagers discovered the flock is gone and realized the boy was telling the truth about the wolf”

Now we can use Segmind’s SDXL API to pass each of these descriptions along with a style and get a corresponding illustration for each.

from segmind import SDXL
import os
import io
import requests
from PIL import Image
import random

model = SDXL(segmindkey)

def generate_images(prompts, style):
    all_images = []

    num_images = len(prompts)

    currentseed = random.randint(1, 1000000)
    print("seed ", currentseed)

    negative_prompt = "photorealistic, realistic, photograph, deformed, mutated, stock photo, 35mm film, deformed, glitch, low contrast, noisy"

    for i, prompt in enumerate(prompts):

        final_prompt = "{},{}".format(prompt.replace('.', ","), style)
        img = model.generate(prompt=final_prompt, negative_prompt=negative_prompt, samples=1, style=style, scheduler="UniPC",
                             seed=currentseed, num_inference_steps=30)

        print(f"Image {i + 1}/{num_images} is generated")
        # img will be a PIL image
        all_images.append(img)

    return all_images

style = "watercolor"
all_images = generate_images(illustrator_prompts['illustrations'],style)
Visualizing all_images from the above code

The above code produces images like the ones shown above, one for each description. Note that not every generated image is perfectly relevant to its description. Still, if you are publishing a blog post or similar, even one relevant image tailored to the story is better than generic free stock images that don’t fit the story at all.
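
To keep the results, the images can be written to disk. The sketch below assumes, as the code comment above states, that each item is a PIL image, and it relies only on that object's `save` method. `save_images` and the file-naming scheme are hypothetical choices made for illustration.

```python
from pathlib import Path
from typing import Iterable, List


def save_images(images: Iterable, out_dir: str, prefix: str = "illustration") -> List[Path]:
    """Save each image as <prefix>_<n>.png inside out_dir and return the paths."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, img in enumerate(images, start=1):
        path = target / f"{prefix}_{i}.png"
        img.save(path)  # PIL infers the PNG format from the file extension
        paths.append(path)
    return paths
```

With the variables from the code above, this could be called as `save_images(all_images, "illustrations")`.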

Step 4: Creating a UI with Gradio

Finally, we’ll create a simple UI with Gradio that allows us to:

  • Enter text for the story
  • Pick an illustration style
  • Click a button to generate images
  • Display a gallery to preview and download illustrations

The interface connects the text and style inputs to our Python functions to fetch illustrations on demand with a single click.

import gradio as gr
import shutil
import os

# https://momlovesbest.com/short-moral-stories-kids

story_1= """Once, there was a boy who became bored when he watched over the village sheep grazing on the hillside. To entertain himself, he sang out, “Wolf! Wolf! The wolf is chasing the sheep!”
When the villagers heard the cry, they ran up the hill to drive the wolf away. But when they arrived, they saw no wolf. The boy was amused when he saw their angry faces.
“Don’t scream wolf when there is no wolf, boy!” the villagers warned. They angrily went back down the hill.
Later, the shepherd boy cried out once again, “Wolf! Wolf! The wolf is chasing the sheep!” To his amusement, the villagers came running up the hill to scare the wolf away.
As they saw there was no wolf, they said strictly, “Save your frightened cry for when there really is a wolf! Don’t cry ‘wolf’ when there is no wolf!” But the boy grinned at their words while they walked, grumbling down the hill once more.
Later, the boy saw a real wolf sneaking around his flock. Alarmed, he jumped on his feet and cried out as loud as he could, “Wolf! Wolf!” But the villagers thought he was fooling them again, and they didn’t come to help.
At sunset, the villagers went looking for the boy who hadn’t returned with their sheep. When they went up the hill, they found him weeping.
“There was a wolf here! The flock is gone! I cried out, ‘Wolf!’ but you didn’t come,” he wailed."""

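def create_illustrations(textInp, styleInp):

    prompt = generate_prompt(textInp, 3)
    illustrator_prompts = fetch_illustrator_prompts(prompt, chatgpt_url, chatgpt_headers)
    # fetch_illustrator_prompts returns None when all retries fail; show an empty gallery in that case
    if illustrator_prompts is None:
        return []
    pil_images = generate_images(illustrator_prompts['illustrations'], styleInp)

    return pil_images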

topics = ["watercolor", "comic book", "kawaii", "line art"]

with gr.Blocks() as demo:
    gr.Markdown("# Kid's story illustrations generator")
    text_input = gr.Textbox(value=story_1, label="Enter Text", lines=4, max_lines=6)
    genre = gr.Dropdown(choices=topics)
    btn_create_illustrations = gr.Button('Generate Illustrations')
    gallery = gr.Gallery(label="images", columns=3)

    btn_create_illustrations.click(fn=create_illustrations, inputs=[text_input, genre], outputs=[gallery])

demo.launch(debug=True, enable_queue=True)
UI Built with Gradio

Conclusion

Automatically generating illustrations for children’s stories has never been easier. The combination of OpenAI’s ChatGPT and Segmind’s SDXL model makes it possible to bring your stories to life with just a few clicks. Instead of using generic stock photos, you can now create custom illustrations tailored to your story. I hope you enjoyed exploring this practical use case of AI!

Happy AI exploration and if you loved the content, feel free to follow me on Twitter for daily AI content!
