Hey, remember when I demoed six text-to-video sites?
Cat genitals, psychedelics, creepy chimeras, and other shenanigans? Ring a bell?
Today, I want to do the same for AI images. (Hopefully with 100% fewer cat penises.)
By my latest count, we now have seven primary public text-to-image models[1]:
DALL-E 3 (OpenAI)
Emu (Meta)
Firefly Image 2 (Adobe)
Ideogram (Ideogram)
Imagen (Google)
Midjourney 5.2 (Midjourney)
SDXL (Stability AI)[2]
Let’s check out the images they generate and learn more about the models.
The process
This won’t be a deep-dive showdown like my SDXL 1.0 vs. Midjourney 5.2 post.
Instead, I’ll briefly introduce each model and showcase the visuals it generates. To keep things consistent and comparable, I’ll be using the same 6 prompts for each model:
Tulips in a meadow, golden hour, watercolor painting
Parrot on a branch, wildlife photography, National Geographic
Portrait of a woman wearing sunglasses, pencil sketch
Abstract shapes, acrylic paint
Ice cream shop, minimal line logo
Colorful banner that says “Digital Art”
I tried to pick prompts that cover a range of picture types, art mediums, and styles.
Because some of the models can only generate square images at the moment, I’ll be sticking to the 1:1 aspect ratio for all images.
Off we go!
1. DALL-E 3 (OpenAI)
DALL-E 3 is the latest image model from OpenAI, having replaced DALL-E 2 in October 2023.
What makes DALL-E 3 special is its ability to faithfully follow long, detailed prompts, thanks to being trained on images with synthetically enriched captions.
DALL-E 3 is great for cartoons with speech bubbles and other images that include writing, because it tends to handle text better than most other models.
Sample images:
DALL-E 3 at a glance:
Interface: Web
Standout features: Prompt adherence and text generation
Is free? Yes (via Bing)
Where to try: Bing Image Creator (free) or ChatGPT (for paid Plus users)
2. Emu (Meta)
Emu, by Meta AI, was first announced in early October 2023, started rolling out to select users in November, and finally became available to all US residents through a standalone site in early December.
The interface is pretty barebones for now: just a simple text box to input your prompt, which generates four alternative square pictures.
Sample images:
Emu at a glance:
Interface: Web (and inside Meta products like Facebook and WhatsApp)
Standout feature: Built-in watermarking for transparency
Is free? Yes
Where to try: imagine.meta.com (if you’re in the US or use a VPN)
3. Firefly Image 2 (Adobe)
First announced at the Adobe MAX conference in October, Firefly Image 2 replaced the first version of Adobe’s in-house image model.
It’s available on the standalone Adobe Firefly site and also powers the company’s suite of products, including Adobe Photoshop and Adobe Illustrator.
Because Firefly Image 2 is built into more advanced Adobe interfaces, you can use it for inpainting, changing image styles, and a whole lot more.
Sample images:
Firefly Image 2 at a glance:
Interface: Web (and inside most Adobe products)
Standout features: Additional editing options like style transfer, text-to-vector, text effects, generative fill, recolor, and more.
Is free? Yes (25 credits per month)
Where to try: firefly.adobe.com
4. Ideogram (Ideogram)
Ideogram arrived seemingly out of nowhere in late August 2023.
It’s the only text-to-image model on the list by a company that wasn’t on the scene until this year. The founders of Ideogram all previously worked on Google’s Imagen (see below) before leaving to start their own thing.
Ideogram was trained from scratch to solve the issue of gibberish text inside images. It was the first model to reliably generate text[3] before DALL-E 3 caught up.
Sample images:
Ideogram at a glance:
Interface: Web
Standout features: Text generation and image remixing
Is free? Yes (25 prompts per day)
Where to try: Ideogram.ai
5. Imagen (Google)
The Imagen research paper first came out back in May 2022, when Midjourney was just starting out and DALL-E 2 wasn’t out yet. Then, while Midjourney and OpenAI rapidly iterated and released public-facing image models, Google just sat on its research. (I even threw a mocking jab at it in this post.)
But in October 2023, Google quietly made image generation available within SGE (Search Generative Experience), using Imagen.
Then, just as I was writing this article and testing the model, Google announced Imagen 2, which so far is only available to developers via Vertex AI.
As far as I know, my images below use Imagen 1, so the text accuracy caught me off guard.
Sample images:
Imagen at a glance:
Interface: Web
Standout features: Text generation, prompt understanding, watermarking
Is free? Yes (if you have SGE available and enabled)
Where to try: Google search (type “draw a picture of [prompt]”)
6. Midjourney 5.2 (Midjourney)
Ah, Midjourney, my greatest obsession[4].
Trained by a relatively small team that shunned all external funding, Midjourney is still seen as the gold standard within image generation. The latest version is 5.2, but version 6 is just around the corner.
Midjourney continues to thrive despite the inconvenient Discord interface and the lack of a free plan. No small feat.
Sample images:
Midjourney at a glance:
Interface: Discord (but the alpha web version is out for power users and coming soon to all)
Standout features: Inpainting, outpainting, Style Tuner, blend, and more
Is free? No (plans start at $10 / month)
Where to try: Midjourney.com
7. SDXL (Stability AI)
Stable Diffusion is why this newsletter exists; it’s the first text-to-image model I ever tried.
Stable Diffusion is currently the only open-source[5] model on this list. It can be downloaded and installed locally, customized, and iterated upon to create even better spinoff models (like Playground 2).
Stable Diffusion XL (SDXL) is the latest “vanilla” version, which I compared to Midjourney 5.2 a few months ago.
Sample images:
SDXL at a glance:
Interface: Web and local install (also Discord, if you insist)
Standout features: Open-source, infinitely customizable, can run locally
Is free? Yes
Where to try: Dozens of image creation sites
Observations
The most obvious conclusion here is that text-to-image models are converging. At the start of the year, there were clear leaders. Now, it’s often impossible to tell AI image models apart in terms of quality.
While Midjourney might still have a slight edge, the gap is closing fast! We’ll see if version 6 does anything to shake that up.
But the biggest surprise by far was Imagen.
Google released it without much fanfare, so I didn’t realize just how good it was, especially when it comes to text. Until now, I thought Ideogram and DALL-E 3 were the only models capable of rendering text.
The sample image I picked wasn’t a fluke. Here’s the entire grid:
Yup: Correctly spelled text in every single image. If this is indeed only Imagen 1, I can’t wait to see what Imagen 2 brings to the table.
But Google also has unexpectedly strict filters when it comes to generating people.
It took around 10 tries before the pencil sketch came out, almost by accident, after which Google refused to return any more images.
All in all, we are truly spoiled for choice.
It’s incredible that—just one year after Stable Diffusion’s debut—seven models of this caliber are available to us, for free…with the notable exception of Midjourney.
What a time to be alive!
Side-by-side comparison
Here’s a look at all 7 models and 6 prompts in a single image:
Over to you…
What’s your favorite AI image model? Do you agree with my observations? Have I overlooked a model in this roundup?
As always, I’d love to hear your input. Send an email to whytryai@substack.com or leave a comment below.
[1] A lot has changed since my April article comparing text-to-image models to smartphone operating systems.
[2] There are excellent sites like Leonardo and Playground with tuned models that easily outperform the vanilla Stable Diffusion XL. But they are spinoffs of Stable Diffusion, and my focus here is on the underlying models, so I won’t look into the many great SD spawns.
[3] Although DeepFloyd IF by Stability AI did have a decent success rate.
[4] I’ve dedicated no fewer than 25 posts to Midjourney thus far, but who’s counting?
[5] Although there’s some debate about the specific definitions of “open-source.”
I like that you provide a side-by-side comparison — do you mind sharing your prompts?
Also, do you think each model has distinct features that the others don’t?
You said it well here: "The most obvious conclusion here is that text-to-image models are converging. At the start of the year, there were clear leaders. Now, it’s often impossible to tell AI image models apart in terms of quality."
I feel this way about all generative AI. The worst model today is probably better than the best model a year ago all across the board, and it's only going to get more competitive from here on out.
Super cool that Google is getting words right! I am sure we'll look back on this problem as trivial one day, but it's sure frustrating today.