Introduction
If you have been keeping track of the latest trends in text-to-image AI, you have surely heard of Stable Diffusion.
It is the latest cutting-edge model, similar to DALL-E 2 and Midjourney.
The interesting thing about Stable Diffusion, unlike those other tools, is that it is open source!
This means anyone can use it for free, or even build applications on top of it!
As we speak, people are already jumping in and using the Google Colab notebooks.
Let us see what Stable Diffusion is, walk through the steps to create your first notebook and your first AI-generated images, and look at some stunning images I generated using it (along with the prompts)!
Stable Diffusion AI
Stable Diffusion is a text-to-image model created by a collaboration between engineers and researchers from CompVis, Stability AI, and LAION.
It is based on a model called Latent Diffusion (High-Resolution Image Synthesis with Latent Diffusion Models). The theoretical details are beyond the scope of this article. Perhaps I will cover them in another post.
For now, we only need to know that it is one of the most anticipated model releases in recent memory. Why? Because, unlike its competitors, it is open to everyone.
How to use Stable Diffusion AI?
Step 1: Sign-up or log in
Sign up or log in to Hugging Face: Hugging Face – The AI community building the future. They created the model you are about to use, and they have provided access to it through a Google Colab notebook.
Once you sign up, accept the terms and conditions of using the model.
Step 2: Getting the access token
Now that you are logged in, you need to get the access token you will use later. For this, go to the settings in the account options in the top right corner.
In settings, you will see the option called Access Tokens. Create a new token.
Give your token a suitable name and select the read role.
Copy the generated token to the clipboard; you will need it later in the Jupyter notebook.
Step 3: The notebook
Here you will need to log in to your Google Colab. Once there, access the Stable Diffusion notebook from this link:
Your PC may not have enough capacity to run this model, so it is better to connect to the hosted runtime.
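As a quick sanity check (my own addition, not a cell from the official notebook), you can confirm that a GPU is actually attached before running the heavier cells:

```python
import torch

# Stable Diffusion is impractical on CPU, so verify that the
# Colab runtime has a CUDA-capable GPU attached.
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Running on: {device}")
```

If this prints `cpu`, switch the runtime type to GPU via Runtime → Change runtime type before continuing.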
Once it is connected, you are ready to go! Go to the Runtime option in the ribbon and select Run all. You may get a warning saying the notebook was not authored by Google. Simply accept and go ahead.
When you reach the 4th cell, you will get the option to enter the access token.
Enter the access token you copied in the previous step. Wait for the Login Successful message.
That is it! You are ready to use the model! The notebook is pretty self-explanatory, so you can try various prompts directly.
Here is the first example they provide, with only four lines of core code and the prompt "a photograph of an astronaut riding a horse".
```python
from torch import autocast

prompt = "a photograph of an astronaut riding a horse"

# `pipe` is the Stable Diffusion pipeline created in an earlier notebook cell.
with autocast("cuda"):
    image = pipe(prompt)["sample"][0]

# `image` is in PIL format (https://pillow.readthedocs.io/en/stable/).
# To keep it, save it:
image.save("astronaut_rides_horse.png")

# Or, if you are in a Google Colab, display it directly with:
image
```
This gives the resulting image:
From here on, you are only limited by your imagination. Try various combinations of prompts, art styles, and even artist names. AI can generate them well!
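To make that experimentation systematic, here is a small helper (the function name and the style/artist lists are my own illustration, not part of the notebook) that composes prompt variations you can then feed to the pipeline one at a time:

```python
def build_prompts(subject, styles, artists):
    """Combine a subject with art styles and artist names into prompt strings."""
    return [
        f"{subject}, {style}, by {artist}"
        for style in styles
        for artist in artists
    ]

variations = build_prompts(
    "a castle on a cliff at sunset",
    styles=["oil painting", "digital art"],
    artists=["Claude Monet", "Greg Rutkowski"],
)
for p in variations:
    print(p)  # in the notebook, pass each prompt string to pipe(p)
```

Each combination often produces a very different image, which makes it easy to compare styles side by side.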
Conclusion
I hope this post gave you plenty of material to learn from. To summarise, this article covered:
An overview of the powerful text-to-image generator Stable Diffusion.
The basics of how it works.
The practical steps on how you can set up the Jupyter notebook and generate images yourself.
The importance of creating prompts, along with some practical examples!
As a concluding thought, the biggest takeaway from using text-to-image generators is that the prompt matters: experimenting with combinations of subjects, art styles, and artist names is what unlocks the best results.