Last week, taking a page from Santa’s playbook, Midjourney sneakily gifted us Version 6, the next iteration of its text-to-image model.
Midjourney V6 is in alpha release for now, so the model will continue to evolve over the coming weeks. Still, I want to touch on the main upgrades V6 brings to the table.
So for this last post of 2023, let’s take a peek at MJ V6 and what it can do.
What’s Midjourney V6?
Midjourney regularly releases new versions of the model, which include improvements to image quality and often entirely new features.
Let’s revisit and update our chronology from the Version 5 post:
Version 1: March 2022
Version 2: April 2022
Version 3: July 2022
Version 4: November 2022
Version 5: March 2023
5.1: May 2023
5.2: June 2023
Version 6 (alpha): December 2023
Even if we count the smaller 5.2 update, this is the longest we’ve had to wait for a new Midjourney release.
What’s new in Version 6?
While you can see the full announcement on Facebook or Discord, here are the three primary improvements:
Limited ability to spell
Better prompt understanding
Improved image coherence and quality
Let’s look at how each pans out in detail.
1. Text generation
Midjourney can finally do text!
As long as you keep your messages short, there’s now a decent chance that Midjourney will spell them correctly.
All you have to do is put the desired text inside “quotation marks” and keep it to about two or three words. Anything above that gets messy.
Let’s test:
Inspirational poster that says "Keep it short!"
Neon sign by a shady motel that says "Vacancy"
Hallmark card with the words "Happy Birthday!" on the cover
Vintage propaganda poster that says "This one's way too long, buddy!"
As you can see, Midjourney hasn’t quite caught up with DALL-E 3 when it comes to text rendering.
But considering how awful Midjourney’s spelling was before, being able to insert short text into your images is a big step forward.
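If you write prompts in bulk, you could sanity-check them against this rule of thumb before sending them off. Here's a minimal Python sketch of that idea. It isn't anything Midjourney provides: the helper names and the three-word limit are my own assumptions, taken from the "two or three words" guideline above.

```python
import re

# Matches text inside straight or curly double quotes.
QUOTED = re.compile(r'"([^"]*)"|“([^”]*)”')

# Assumption: the limit comes from the "two or three words" rule of thumb.
MAX_WORDS = 3


def quoted_text(prompt: str) -> list[str]:
    """Return every double-quoted snippet found in the prompt."""
    return [straight or curly for straight, curly in QUOTED.findall(prompt)]


def too_long(prompt: str, max_words: int = MAX_WORDS) -> list[str]:
    """Return the quoted snippets that exceed the word limit."""
    return [s for s in quoted_text(prompt) if len(s.split()) > max_words]


if __name__ == "__main__":
    prompt = 'Vintage propaganda poster that says "This one\'s way too long, buddy!"'
    for snippet in too_long(prompt):
        print(f"Probably too long to spell correctly: {snippet}")
```

A short snippet like "Keep it short!" passes the check, while the propaganda poster text gets flagged. Passing the check is no guarantee that V6 spells the text correctly; it only catches the cases that are likely to get messy.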
2. Prompt adherence
Midjourney Version 6 is significantly better at understanding long and detailed descriptions.
The team made a point of highlighting that we may have to “relearn” how to prompt. I was personally thrilled to see them explicitly tell people to “…avoid ‘junk’ like ‘award-winning, photorealistic, 4k, 8k,’” which is what I’ve been preaching since January.
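If you have a stash of old prompts padded with these filler terms, a tiny script can clean them up. The sketch below is just one way to do it, assuming your prompts are comma-separated. The `JUNK` set holds only the four terms quoted from the announcement, and the `strip_junk` name is my own invention.

```python
# Filler terms quoted from the V6 announcement; extend the set as needed.
JUNK = {"award-winning", "photorealistic", "4k", "8k"}


def strip_junk(prompt: str) -> str:
    """Drop comma-separated segments that are nothing but a filler term."""
    parts = [p.strip() for p in prompt.split(",")]
    kept = [p for p in parts if p and p.lower() not in JUNK]
    return ", ".join(kept)


if __name__ == "__main__":
    print(strip_junk("Underwater photo of a colorful octopus, award-winning, 8K"))
```

Note that this only removes segments that match a filler term exactly (ignoring case), so a phrase like "photorealistic painting of a cat" is left alone.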
My own tests of prompt understanding typically involved asking for objects with specified shapes and colors on a table. The latest one was the following:
Two red balls and one blue cube on a green table
Here’s how Midjourney 5.2 handled it at the time:
And now…here’s Version 6:
This was the very first try, and apart from the extra ball in the first image, Midjourney V6 nails it!
Here’s a showcase with three more scenes, where I’ve picked the best image out of four.
Naive art drawing of three animals sitting together on tree stumps in the forest. The animal on the left is an owl wearing square glasses. The animal in the middle is a hedgehog wearing a flower wreath. The animal on the right is a squirrel
There are three baskets full of fruit on a kitchen table. The basket in the middle contains green apples. The basket on the left is filled with strawberries. The basket on the right is full of blueberries. In the background, there is a blank teal wall with a circular window.
Photo of an old man's face. He has bright blue eyes, a black wart on his nose, and is wearing a yellow cap. He is laughing and we can see that his teeth are crooked.
I don’t know if V6 was trained using a bespoke captioner like DALL-E 3, but it’s clear that we’ve entered a new era in Midjourney’s prompt comprehension. It’s now often enough to give a straightforward description in natural language to get what you need.
3. Image quality and coherence
As you’ve probably already noticed, the quality of V6 output is incredible, especially when it comes to photographic images.
Here’s a random selection of prompts and resulting images:
Worm's eye view photo of a cyberpunk motorcycle wheel running over a puddle. Neon signs of the city reflect in the puddle.
Documentary footage of mother deer and her fawn walking through a snowy forest at dawn
Cinematic close-up of an angry clown's eye
Underwater photo of a colorful octopus
Landscape photo of an alien planet full of bioluminescent plants and tall, purple trees
Cute microscopic creature wearing a scarf
This level of quality, combined with Midjourney’s newfound ability to truly understand your requests, opens up a whole new world of opportunity for visual storytelling, illustration, and more.
Other notes
The launch announcement also mentions that Midjourney V6…
1. Is better at handling image prompts and remixing images.
2. Supports upscalers that increase resolution by 2x. There’s a “subtle” version that stays close to the original image and a “creative” version that may introduce new details. You’ll find them under your chosen individual image:
Here’s our original microscopic creature, followed by subtle and creative upscales:
Updated Midjourney version comparison
Let’s bring back the comparison images from my V5 post and add V6 to the mix:
Forest hut
Hamster photo
Hoverboard
Less than two years of text-to-image progress.
Pretty impressive, eh?
How to start using Midjourney V6
If you’re new to Midjourney, check out my step-by-step “getting started” guide (mostly steps 1-3).
If you already have a Midjourney account, there are two ways to use V6:
Option 1 (“permanent”):
This one’s best if you’re planning to mainly use Midjourney V6 going forward.
Type “/settings” in the chat to open the “Settings” menu:
Select Midjourney Model V6 [ALPHA] in the dropdown at the top.
From now on, any image you generate will use version 6 by default.
Option 2 (“per image”):
You can also simply add the “--v 6” suffix to any individual prompt:
This is useful if you prefer to stick to the current default model (V5.2) and only want to test V6 for specific types of images.
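If you keep your prompts in a script or spreadsheet, you can tack the suffix on automatically. Here's a rough Python sketch: the `with_version` helper is hypothetical, and it only looks for the short `--v` form of the parameter mentioned above.

```python
import re

# Matches an existing "--v <value>" parameter anywhere in the prompt.
VERSION_FLAG = re.compile(r"(?<!\S)--v\s+\S+")


def with_version(prompt: str, version: str = "6") -> str:
    """Append "--v <version>" unless the prompt already sets a version."""
    if VERSION_FLAG.search(prompt):
        return prompt
    return f"{prompt.rstrip()} --v {version}"


if __name__ == "__main__":
    print(with_version("Underwater photo of a colorful octopus"))
```

Prompts that already pin a version (say, `--v 5.2`) are returned unchanged, so you can run the helper over a mixed list without overriding earlier choices.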
Have fun!
Over to you…
Have you already tried Midjourney V6? If so, what’s your take?
How do you think V6 will fare in the increasingly crowded text-to-image space? With the proliferation of solid, free alternatives, can Midjourney still justify its price tag?
As always, I’d love to hear your input. Leave a comment or shoot me an email at whytryai@substack.com.
Happy New Year, and I’ll be back in 2024!
I need to start using "worm's eye view"! That's simple and direct.
I'm glad to see that our professional artist can almost spell at an elementary school level! What a strange moment to find ourselves in.
Thanks as always for your clearly explained tutorials, Daniel. For me, Mr. Geezer FussyButt, I'm waiting until they get it off of Discord. Then I'll dive in.