21 Comments

I don't have any brilliant things to add today (most of those tend to go over at AI Jest), but I would add that image generation is really fun now. It used to be amazing, but also kind of frustrating. I think we're still in the "fun but frustrating" camp today, but sliding more and more toward getting something useful and beautiful with less additional prompting and editing.

If you have a lot of text you want to break up with images, you can also ask for suggestions, and GPT-4 is really good at giving them now.

author

Yup, that's my take as well. Image generation is truly available to the masses now, and you can go a long way by simply describing what you want without additional technicalities.

Great article. We stumbled on getting LLMs to create image prompts for us when we were trying to automate a pipeline of work. We then 'illustrated' this idea of chaining AI tool prompts together in a project on visualizing dad jokes. For an example, see https://open.substack.com/pub/thetaonpi/p/spaghetti-images?r=2unyem&utm_campaign=post&utm_medium=web

author

Nice! That was a fun behind-the-scenes breakdown of iterative prompting that I always encourage people to follow. Well played!

Jan 19 · Liked by Daniel Nest

You're a mind reader, Daniel. I'm about to dive back into Gen AI, and your article is pretty much exactly what I was about to search for. Looks like ChatGPT Pro is up next for me.

Well done, thanks!

author

Before you jump headfirst into ChatGPT Plus, I'd suggest starting with Microsoft Copilot to see if DALL-E 3 gives you the look you're going for. You're a Stable Diffusion guy, from what I remember, so you'd probably want to see how DALL-E 3 compares. Then, if you're happy, you can splash out on a $20 per month ChatGPT Plus subscription.

Jan 19 · Liked by Daniel Nest

Thanks. You introduced me to Dalle earlier, and I liked it very much. But the Microsoft site is a freebie, so it kept booting me out. I decided to pursue Dalle elsewhere, which is what I now hope to do.

"Look before you leap" is good advice, though. Have you done a review of ChatGPT Plus? I looked around in your article list but didn't see it. Maybe I didn't look hard enough?

Also, another thing I'm on the hunt for is tips on how to generate fiction in ChatGPT. So far, all I've done is pop a hastily considered one-sentence prompt into ChatGPT 3.5 and then copy, paste, and publish whatever it gave me. You know, very basic usage. Got a lot to learn yet.

I expect to use both Dalle and SD via Dezgo, at least for starters.

Thanks again.

author

I did indeed do a deep dive into all ChatGPT Plus features back in October:

https://www.whytryai.com/p/chatgpt-plus-upgrades-review

Things have improved since then, and you no longer need to start separate chats per feature. Every feature is integrated into every new chat, as long as you have ChatGPT Plus.

Thank you!

Wow, this was really useful. I’m pretty much against using AI images for now, as they still seem very lifeless to me, but I’m open to changing my mind. I loved the thumbnail, by the way. Really unique and interesting.

author

Hey Andrei - happy to hear you found the post useful.

I can only encourage you to take the current generation of text-to-image models for a spin. You might be surprised by what they're capable of these days. Almost all of them can be tested for free (except Midjourney), too.

(The thumbnail is Midjourney V6.)

Thanks for the tip! Will make sure to give them another try soon.

author

If you do end up testing them out, I'd love to hear your thoughts.

Jan 18 · Liked by Daniel Nest

Thanks, Daniel! Have you tried Krea.ai, the real-time image/text/camera-to-image AI? Really fascinating!

author

I have actually! I showcased KREA, Leonardo's Live Canvas, and Pikaso by Freepik in this issue of 10X AI: https://www.whytryai.com/p/10x-ai-31-google-gemini-meta-emu-live-canvas-ai

As far as I can tell, they all use some variation of Stable Diffusion under the hood, judging by the renderings.

Agreed, real-time rendering is very fascinating. We've come a long way from when I first got into AI thanks to AI images.

Jan 18 · Liked by Daniel Nest

Oh, I missed it! I'll check right away :) Thanks again for sharing all the good stuff. I installed this app thanks to you, so it's a 3x thanks to you, Daniel. Keep it up!

AI Challenge: use this process to recreate the Cronenberg dog from Midjourney V1 (without remixing the original image).

author

Looks like that kind of existential terror is reserved for accidental, spontaneous occurrences.

We had a strong start, with ChatGPT against all odds recognizing that it's a dog:

"The image depicts a surreal and abstract representation of a dog. The dog's body appears predominantly white, with a smooth texture that resembles a fine oil painting. Its face is contorted in an unusual way, with a large, eye-like aperture where one would expect its forehead to be. This circular opening reveals a deep, hollow space, adding a haunting element to the creature's visage." (There are four more paragraphs of description after this)

It then suggested this prompt:

"A surreal portrait of a white dog with a hauntingly hollow, eye-like aperture on its forehead. Its eyes are mismatched, with the left one being oversized and rich brown, while the right one is smaller and darker. The dog's snout is short with pinkish nostrils, and its slightly open mouth reveals a hint of teeth and a visible tongue. The background is a gradient from golden yellow to burnt orange, creating an atmospheric glow. Visible brushstrokes suggest a classical oil painting technique, blending the lines between reality and fantasy in a striking and memorable visual experience."

A very kind and forgiving description, if you ask me.

Nothing came close to replicating the raw nightmare of the original.

Here's the "worst" of Midjourney V6:

https://i.imgur.com/drfL1wq.png

And here's DALL-E 3 in ChatGPT:

https://i.imgur.com/GOf7PAU.png

Amateurs!

This makes me think that artists might try to save early Stable Diffusion checkpoints in order to have a tool for surrealist art, kind of like how some musicians still use synths from the 70s and 80s because they produce a sound that isn't easily reproducible with modern hardware.

author

Totally. It'll be the retro look of simpler times of 2022 within a few months.

author

Ha, challenge accepted - although I'm afraid of what I might stumble upon.
