20 Comments

That's fun!

author

It is, isn't it?

Mar 15 · Liked by Daniel Nest

Great job on these. I am still trying to get one of these tools to accurately generate images of hands with all five fingers, in a single close-up, just changing the nail polish color and skin tone. Any thoughts on the best tool for this?

author

Well, Midjourney is actually pretty consistent at making anatomically correct hands now.

But keeping the same exact image while changing skin/nail colors isn't a built-in feature. I just tried doing this, and even if you keep the seed the same but change the nail color, the images are different.

Then again, if all you're after is keeping the exact same hand, I'm sure there are third-party tools that let you select nails and change them. You could even consider giving Generative Fill in Adobe Firefly a go for that!

Mar 14 · Liked by Daniel Nest

The consistent character GPT seems far simpler, but I'm not qualified to do a head-to-head comparison with Midjourney. All these tools have their limitations, and I'm looking forward to exploring Midjourney as an alternative to ChatGPT once they make the website fully available.

I'm learning to upload an image I want to work with and ask ChatGPT to describe it. This tells me what language ChatGPT is using for that image. Then I'll make a template including ChatGPT's own description for additional image generations. This seems to really help keep characters consistent, though again, perfection is typically not an option. Most of the time it's close enough though.

author

Yeah the big win here is that Midjourney is much better than DALL-E 3 for photographic imagery, so being able to use it for consistent character generation is huge for virtual photoshoots, etc.

And the process you've described is very similar to this tip I shared back in the day: https://www.whytryai.com/p/10x-ai-22-canva-magic-studio-linkedin-zoom#%C2%A7mimic-styles-without-relying-on-artists-names

It's about using image recognition to describe the image and then recreate it, so you can reuse that style.

The Midjourney website should go live relatively soon for everyone, as they've now lowered the entry requirement to 1K images generated (used to be 10K). So my guess is a couple of months at most, if not sooner.

Mar 14 · Liked by Daniel Nest

ChatGPT/DALL-E is experiencing a meltdown today, so Midjourney is sounding ever more interesting. I'm guessing I'm not going to be able to get serious work done with just one image generator. We'll see...

Thanks as always for all your useful information!


Oh wow, this is going to unlock a ton of new use cases for people just trying to make things for themselves. Any insight into how they're able to do this technically? I'm very curious about what changes they made to the model to make this work.

author

Midjourney don't really share much about their inner workings, unfortunately. Even during office hours, the focus tends to be on what's coming and what's being worked on, but not the "how."

But it appears they found a way to consistently isolate styles, aesthetic details, and specifics in their own images. That's what makes the style tuners possible (selecting your preferred pictures so they can blend them into a unique, reusable style). That's also what's behind --sref (style isolation), and now it appears they found a way to use a similar approach to isolate the subjects.

If you do stumble upon a technical explanation, I'd love to hear it! (And I will likewise let you know if I see anything.)

Mar 14 · Liked by Daniel Nest

Interesting, how do you feel about Midjourney still operating on Discord? (Feels kind of amateurish to me.)

And have you tried Artflow? It’s been doing this for a while (though it has some of the same problems and utilizes a freemium model).

author
Mar 14 · edited Mar 14

Yeah I've never been a fan of the Discord interface, but I've gotten quite used to it over time.

Then again, I've had access to the Alpha website since December and have been using that exclusively. Wrote about that here (https://www.whytryai.com/p/10x-ai-32-midjourney-alpha-google-imagen-2).

I use Discord for the tutorials to make sure others can follow.

But Midjourney has since lowered the criteria for website access to 1K generated images, and they're gearing up for a mass rollout soon. So I guess within a few months, everyone who wants the website will be able to use it!

I did check out Artflow back in the day. There are many great alternative UIs like Leonardo, Playground, etc. I'm so used to Midjourney and DALL-E 3 in ChatGPT Plus that I haven't been actively looking for additional interfaces to use.

Are you a frequent Artflow user?

Mar 14 · Liked by Daniel Nest

Gotcha gotcha, my image generation needs are pretty few and far between, so I mostly use DALL-E.

I’m more just waiting for a company to be able to render fully animated stories with consistent characters over long runtimes, which is why I mentioned Artflow.

Obviously we’re not there yet, but Sora, it seems, gets close.

author

Yeah Sora is pure witchcraft, and Mira Murati said in a recent interview that she expects a public release definitely this year and maybe in a few months.

Mar 14 · Liked by Daniel Nest

I heard it takes ten minutes to render, but that’s totally worth it for the quality, especially if it’s bundled into the GPT subscription.

author

Yeah, 10 minutes for a 1-minute magical video based purely on a text prompt is nothing. And if recent history is anything to go by, we should expect the speed to improve quickly and the costs to come down as well after it launches.


Completely agree

Mar 14 · Liked by Daniel Nest

I can imagine what you're gonna do with Claude's API using ASCII characters. You certainly have a talent! Keep rocking!

author

ASCII art, the final art frontier!


I know I'm different than most, but I'm usually not happy until I've chained together three different apps anyway.

author

Why settle for less when you can have more?!
