3 Awesome AI Toys We’ll Get Very Soon
Here are a few imminent launches in the world of AI, expected somewhere around Q1 2023. This year's generative AI scene is shaping up to be a wild ride.
Welcome to 2023, you beautiful maniacs!
If you thought 2022 was crazy with all the relentless AI developments, you ain’t seen nothing yet.
Here’s a prediction from OpenAI’s co-founder Greg Brockman:
And here’s one from Ben Tossell, who runs the daily Ben’s Bites newsletter:
Could two people with the exact same forecast ever be wrong?!
Uh, yes. Obviously.
But my prediction is that their predictions are spot on.
This year, AI-based tools will seep ever deeper into the mainstream.
This article from Information Age and this one from Forbes argue that AI will have a wide-ranging impact on search, healthcare, banking, transportation, and much more in 2023.
Here on “Why Try AI?” my more humble focus is on stuff that’s easily accessible to regular people like myself.
In that space, I’m excited to play with three things that should come out in the first quarter of the year.
1. GPT-4 is so, so close
ChatGPT dropped in late November 2022 and gained 1 million users in less than a week.
It was the first time GPT-3 made the jump from niche copywriting communities and business applications to true mass adoption.
(Technically, ChatGPT is a sort of polished “GPT-3.5” version, but it’s largely based on the same model.)
If you’re not up to speed on GPT-3, I showcased some of its capabilities in my post about Lex, the free AI-powered word processor.
The next iteration of the model—GPT-4—is poised to blow GPT-3 out of the water. And it’s coming soon.
We’re talking max a few months, if all the buzz is true.
Here’s what GPT-4 might deliver in practical terms:
Broader data set: GPT-4 will be anchored in a vastly larger volume of training data and be up to speed on recent events (GPT-3 only knows stuff up to late 2021).
Better comprehension: The model will be even better at understanding natural language commands, requiring less nuanced prompting to get relevant results.
Extra coherence: GPT-4 should output text that’s even more thorough and competent.
Some have speculated that GPT-4 might be multimodal (able to work with e.g. images and video in addition to text), but OpenAI’s CEO Sam Altman has explicitly stated it won’t be.
Regardless, there’s little doubt that GPT-4 will be a very big deal:
Let’s hope we soon get the chance to test this out for ourselves!
2. Midjourney V5 might drop at any moment
Released in November 2022, Midjourney Version 4 is what finally got me aboard the Midjourney train.
It was a massive step up from the somewhat unpolished feel of V3, bringing better image coherence with less need for complex prompting.
Here are a few comparisons using the same prompts:
The ability of Midjourney V4 to render incredible images with minimal effort makes it the current reigning champion within the text-to-image space, in my opinion. I recommend it as the best image generator for beginners, despite its clunky-ish Discord interface.
And, well, it looks like V4 is about to be old news.
In mid-December, Midjourney founder David Holz said that Version 5 of the algorithm should go live around January 2023.
If my calendar is to be trusted, that’s this month!
During yesterday’s Discord office hours, David indicated that—while they’re not in a rush to push V5 out—the version is progressing nicely and solves a number of V4’s issues.
So even by conservative estimates, we’re likely just a few weeks away from playing with the next generation of Midjourney’s text-to-art tool!
3. Text-to-3D is (sort of) already here
2D images are cool, but 3D renders are at least 50% cooler!
In a recent poll on my Facebook page, you surprised me by ranking text-to-3D as your most anticipated AI development:
In the ancient past of September 2022, Google unveiled a research paper on DreamFusion, which could synthesize 3D models from text prompts.
Here’s a good look at it by Károly Zsolnai-Fehér of Two Minute Papers:
Shortly after that—as I briefly touched upon—NVIDIA presented its own 2D-to-3D model:
But both of those were work-in-progress concepts. The world at large couldn’t try them out or see a working demo.
Enter OpenAI.
In late December 2022, right in time for Xmas, OpenAI revealed an open-source model called Point-E, capable of generating 3D renders using point clouds:
Point-E is already available for developers to work further on.
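If you're wondering what a "point cloud" actually is: it's just a list of points in 3D space, each with an x, y, and z position (and usually a color). Here's a tiny sketch in plain Python to make that concrete. To be clear, this is my own toy illustration and not Point-E's code: the function names `make_sphere_cloud` and `rotate_z` are made up for this example, and the "model" is just a sphere. It builds a small cloud and spins it around the vertical axis, which is roughly what a viewer does when you rotate a result.

```python
import math


def make_sphere_cloud(n_points=200, radius=1.0):
    """Return n_points (x, y, z) tuples spread over a sphere's surface.

    Uses a Fibonacci spiral so the points are roughly evenly spaced.
    """
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    cloud = []
    for i in range(n_points):
        # z steps from near the top of the sphere to near the bottom
        z = radius * (1.0 - 2.0 * (i + 0.5) / n_points)
        # radius of the horizontal ring at this height
        ring = math.sqrt(max(radius * radius - z * z, 0.0))
        theta = golden_angle * i
        cloud.append((ring * math.cos(theta), ring * math.sin(theta), z))
    return cloud


def rotate_z(cloud, degrees):
    """Rotate every point around the z axis by the given angle."""
    a = math.radians(degrees)
    cos_a, sin_a = math.cos(a), math.sin(a)
    return [(x * cos_a - y * sin_a, x * sin_a + y * cos_a, z) for x, y, z in cloud]


cloud = make_sphere_cloud()
spun = rotate_z(cloud, 90)
print(len(cloud), "points; first point before and after:", cloud[0], spun[0])
```

A real generated point cloud is the same idea with thousands of points, plus a color per point.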
“But I’m not a developer!” you scream at the screen in front of you, “I don’t want to hear about hypothetical mumbo-jumbo that I can’t even use right this second!”
Rejoice, you impatient hooligan, for there’s a way for you to test drive Point-E for yourself.
Behold “The Point-E text-to-3D demo” on Hugging Face!
Check out this spaceship it just generated for me:
The demo space lets you customize a bunch of settings and create 3D point clouds from text input, image input, or a hybrid (text-to-image-to-3D). You can then zoom, rotate, and otherwise interact with the result.
Neat, right?
The demo quality isn’t exactly Hollywood-level CGI, but damn… we’re making 3D stuff from nothing but text in seconds, people. Cut Point-E some slack!
Over to you…
What AI tools, apps, or gadgets are you personally looking forward to? Did I overlook some major upcoming launches?
I love discovering new AI stuff, so don’t hesitate to email me or leave a comment below this post.
See you next week!
Great summary!
I'm very much looking forward to both GPT-4 and Midjourney V5. The public adoption of AI images seems to be much slower than text, so it's particularly interesting to see what comes of the next gen ChatGPT.
This might be a strong opinion, but to me roughly half of all text-based content on the internet could be replicated by GPT-3. It stands to reason GPT-4 will close that gap to... 80%? 95%?
All those poor communication/journalism majors piling into debt right now.