Sunday Rundown #81: Xmas Miracles & Goro Dog
Sunday Bonus #41: Santa's virtual gift visualizer.
’Tis the holiday season, so this is my final Sunday Rundown of 2024. I’ll catch you up on all the news when I return on January 5. Merry Christmas and a Happy New Year!
Happy Sunday, friends!
Welcome back to the weekly look at generative AI that covers the following:
Sunday Rundown (free): this week’s AI news + a fun AI fail.
Sunday Bonus (paid): a goodie for my paid subscribers.
Let’s get to it.
🗞️ AI news
Here are this week’s AI developments.
👩‍💻 AI releases
New stuff you can try right now:
ElevenLabs dropped Flash v2.5, a text-to-speech model with extremely low latency (75ms) for humanlike real-time conversations in 32 languages.
Google keeps chugging along:
Gemini 2.0 Flash Thinking Mode “thinks” through its answers before responding, much like OpenAI’s o1, except it’s free. Try it on Google AI Studio (select Gemini 2.0 Flash Thinking Experimental in the “Model” dropdown).
Imagen 3 has gotten even better at creating high-quality, prompt-adherent images and now beats all competition on Human Evaluation benchmarks.
NotebookLM is now powered by Gemini 2.0 Flash, has a better interface, and lets you speak to the hosts to ask questions and steer conversations.
The Veo 2 video model is now truly state-of-the-art, generating far more consistent and prompt-adherent videos than any existing competitor. Looks like I’ll have to make a new version of my video model comparison in 2025. (Sign up for the waitlist.)
Whisk is a super fun experimental image tool that lets you upload references and specify subjects, scenes, and styles to generate and remix images.
Kling AI released an upgraded 1.6 version of its video model with better aesthetics, prompt adherence, and physics simulation.
Microsoft upgraded the Bing Image Creator with the latest version of DALL-E 3, a new interface, and an easier way to share your creations.
Midjourney expanded its personalization features, and you can now personalize the model by uploading curated images into custom “Moodboards.”
OpenAI wrapped up its “12 Days of OpenAI” with five final announcements:
Day 8: Search in ChatGPT has been improved, can be used with Advanced Voice Mode, and is now also available to free logged-in users.
Day 9: Lots of developer-focused OpenAI o1 improvements and features.
Day 10: You can now talk to ChatGPT on the phone by calling 1-800-CHATGPT in the US (or message it via WhatsApp from elsewhere).
Day 11: ChatGPT can now see and work with more apps on your desktop and integrates with Advanced Voice Mode.
Day 12: This is huge. The new o3 reasoning model is a successor to o1 that proves two things: First, OpenAI is getting progressively worse at naming its models. Second, we’ve reached an entirely new paradigm in AI reasoning. o3 smashes benchmarks that were thought to be near-impossible for AI to beat.
provides an exhaustive look at just how big of a deal o3 is. (As does AI Explained in the “AI Resources” section below.)
Pika came out of hibernation with Version 2.0 of its video model, which looks better, responds more accurately to prompts, and lets you add reference images to steer its output.
Suno has made its newest V4 music model available to free users. (Try it here.)
🔬 AI research
Cool stuff you might get to try one day:
Genesis is a massive collaboration project to create a “universal physics engine” that turns natural language requests into fully realized visual simulations.
Head of Instagram Adam Mosseri teased upcoming AI-powered video creation tools and visual effects.
Odyssey ML introduced Explorer, a generative world model that turns text prompts into moving world simulations or Gaussian splats.
📖 AI resources
Helpful AI tools and stuff that teaches you about AI:
“Alignment faking in large language models” - Anthropic found that LLMs can pretend to follow safety protocols when given sufficient incentives.
“Deliberative alignment: reasoning enables safer language models” - conversely, OpenAI shows that better reasoning capabilities can lead to better alignment.
“o3 - wow” [VIDEO] - AI Explained is known for sober takes without hype, so the title should give you some indication of how big a deal o3 is.
🤦‍♂️ AI fail of the week
“Ant! Ant! ANT?! Where the hell is that dog?!”
(Used in my last article after corrective interventions.)
💰 Sunday Bonus #41: Turn your Xmas wishes into virtual gifts from Santa
In the spirit of the season, today’s bonus makes your wishes come true.
Sort of.
I made a silly tool that turns your Christmas wish into a whimsical illustration of Santa trying to bring it to you.
Is it groundbreaking stuff?
Not really.
Is it a seasonally appropriate way to have a bit of fun?
It sure is!