10X AI (Issue #21): Meta AI, ChatGPT Upgrades, Art Tools, and An Overkill Knife
PLUS: Pika Labs text messages, NExT-GPT multimodal LLM, Getty AI image generator, and how to preview DALL-E 3 using the Bing Image Creator.
Happy Sunday, friends!
Welcome back to 10X AI: a weekly look at beginner-focused AI news, tools, and tips.
Let’s get to it.
🗞️ AI news
Here are this week’s AI developments.
1. Meta’s AI extravaganza
Last week, Google and Microsoft started plugging AI into their ecosystems.
You’ll never guess who did it this week!
Oh, you read the title and can guess it easily? Carry on, then.
Meta has a whole slew of “AI Experiences” coming up:
A Meta AI assistant users can interact with on WhatsApp, Messenger, Instagram, and so on
28 unique AI personalities voiced by different celebrities
Ability to generate unique digital stickers using Meta’s proprietary text-to-image model called Emu
The ability to restyle your images and change their backgrounds using AI directly in Meta’s suite of products
Some of the features are in a US-only beta for now but should roll out elsewhere in the coming weeks.
2. ChatGPT gets a boost
Not content with just teasing the imminent DALL-E 3, OpenAI is now also beefing up ChatGPT.
This includes:
The ability to have ChatGPT respond to voice queries using a selection of natural-sounding AI voices
Image recognition, which finally gives ChatGPT the same multimodal powers Bing has had for a few months
The above is rolling out to Plus and Enterprise users over the next two weeks.
In addition, ChatGPT can now browse the web again. The feature was suspended earlier this year because it could be used to bypass paywalls.
If you’re a paid user, you should already be able to browse in ChatGPT.
Go to Beta features and turn on “Browse with Bing”:
After that, you’ll see “Browse with Bing” as a dropdown selection under GPT-4:
Enjoy your old-new Internet surfing powers!
3. Pika Labs can now add text to your videos
Lately, Pika Labs’ free, Discord-based text-to-video generator is giving Runway a run for its money. (Yes, I went there.)
This week, Pika Labs released a nifty feature that lets you incorporate text into your short videos.
For the video teaser, the Pika team seems to have borrowed the recipe for skull-shattering music tracks from Adobe’s launch playbook:
It works, too:
If you’re on Discord, give Pika a whirl. It’s free, remember?
4. NExT-GPT and any-to-any multimodality
While ChatGPT is learning to see, hear, and speak, this LLM is taking it all the way to near-complete multimodality.
The awkwardly named NExT-GPT is an any-to-any multimodal large language model that can take text, image, audio, and video input in any combination and generate output in the same modalities.
Here’s a video walkthrough:
Being able to move beyond text chat is the next frontier of human-to-AI interaction, and this is an exciting step in that direction.
5. Getty Images launches an in-house image model
After being one of the first to ban AI-generated content in September 2022, Getty Images has now announced its own text-to-image AI generator.
The main selling point is that it’s trained exclusively on Getty’s library of images and its output is commercially safe: no legal concerns, no ethical issues, etc.
The model is in a “Request Demo” stage, so it’s not possible to test it directly. I guess time will tell whether there’s a market for Getty’s “worry-free images” play.
🛠️ AI tools
Today I’d like to look at a few niche AI art sites. There are dozens (if not hundreds) of places where you can generate images using Stable Diffusion and other image models.
But these three focus on doing one specific thing really well.
6. KREA
If you’ve been on social media lately, you’ve likely seen viral images where objects in a seemingly ordinary picture form the shape of a larger “hidden” visual. To see it, you have to either squint your eyes or stand far enough from the picture.
KREA lets you do this using two separate tools: one for patterns or text and one for logos. (Although you can use either one to upload your own “hidden” image to be baked into the generated one.)
I used the Apple logo with “apple orchard” as the prompt:
Here’s one of the resulting images:
This one’s way too obvious to count as an illusion, but you get the picture.
KREA also has a waitlist for their complete design tool, but the two illusion makers above are completely free to try.
7. LeiaPix Converter
LeiaPix Converter creates depth animations to give images a 3D vibe.
You simply upload an image…
…optionally tweak some of the settings…
…then get your animation:
LeiaPix Converter gives you a generous number of starting credits, and you can even export animations for free if you don’t use the full-resolution settings.
8. QrGPT
This was yet another passing craze a few months ago: QR codes with artistic backdrops.
QrGPT lets you make your own using AI-generated imagery.
Just type in the URL you want your QR code to point to along with the prompt for the image:
In around 10 seconds, you’ll have a functional QR code with your chosen AI-generated background:
QrGPT is completely free to use.
💡 AI tip
Here’s this week’s tip.
9. Get a free taste of DALL-E 3 in Bing Image Creator
DALL-E 3 was my top item in the last issue, with the official release coming to ChatGPT Plus and Enterprise users in October.
But…it’s apparently already here!
It looks like Microsoft has quietly released DALL-E 3 inside the Bing Image Creator.
If you go to https://www.bing.com/create and log in with your Microsoft account, you can start generating images using the latest model.
I tested it out with the long prompt from the last issue. Not only does Bing faithfully follow every part of the description, but the resulting image is uncannily similar to the demo one from OpenAI.
Here’s the showcase image again:
And here’s what I got:
This is a level of prompt adherence I really haven’t experienced before.
(To be fair, two of the four generated images weren’t nearly as spot on.)
Of course, this is still a WIP release with several limitations:
Bing Image Creator can only make square images
You can’t discuss and iterate on the resulting image conversationally, as you’ll be able to do with ChatGPT
The underlying model may still be slightly different. There’s not much official info about it.
But hey, it’s a free sneak peek at the future of text-to-image: Go have fun with it!
🤦‍♂️ 10. AI fail of the week
I like Swiss Army Knives as much as the next guy, but this seems excessive.
I'm still using Bing's image generator every day, or just about (today I embedded YouTube videos instead, but yesterday I definitely used it). I'll be a keen observer of what sorts of "improvements" have been made.
One thing I noticed was that Bing (DALL-E) seems to be really slow right now. I also hit a snag asking for something in the style of a particular artist; I'm wondering if one improvement was more on the legal front.