Sunday Rundown #64: Google Goodies & Brain Explosion
Sunday Bonus #24: How to make a character enter a scene using image-to-video.
Reminder: Upcoming Midjourney workshop
My live walkthrough of the Midjourney website and prompting basics—hosted by Charlie Guo of Artificial Ignorance—is on August 30 at 11 AM PST.
If you’re a paid subscriber to Why Try AI, see event details here.
If you’re a paid subscriber to Artificial Ignorance, see event details here.
Happy Sunday, friends!
Welcome back to the weekly look at generative AI that covers the following:
Sunday Rundown (free): this week’s AI news + a fun AI fail.
Sunday Bonus (paid): a goodie for my paid subscribers.
Let’s get to it.
🗞️ AI news
Here are this week’s AI developments.
👩‍💻 AI releases
New stuff you can try right now:
Anthropic introduced “prompt caching,” which lets developers significantly reduce costs (by up to 90%) and latency (by up to 85%) by caching frequently used context between API calls.
Google did a few things this week:
Made its Imagen 3 model available to more US customers via its AI Test Kitchen. (It was part of my big text-to-image comparison two weeks ago.)
Added new features to AI Overviews in Search and made them available in more countries.
OpenAI reclaimed the #1 spot on the LMSYS Chatbot Arena Leaderboard with its latest version of GPT-4o.
Midjourney combined several website features like “Reframe” and “Repaint” into a new, more intuitive web editor. (Also, you can now unlock the website after only 10 generations in Discord.)
Nous Research open-sourced Hermes 3, a fine-tuned LLM based on Llama 3.1 that unlocks “deeper capabilities in reasoning and creativity.”
Perplexity will now show real-time predictions and odds for relevant searches through a partnership with Polymarket.
Runway launched Gen-3 Alpha Turbo, a speedier version of its latest-generation video model. It’s 50% cheaper and 7x faster than Gen-3 Alpha and is even available to free accounts (as long as you have some free credits left).
xAI released beta versions of its latest LLMs, Grok-2 and Grok-2 mini. Grok-2 is competitive with many top-tier models. Both LLMs are available on X/Twitter to Premium users.
🔬 AI research
Cool stuff you might get to try one day:
Cosine introduced a new AI software engineer called Genie, which scores above 30% on the SWE-bench benchmark. (Join the waitlist.)
Exists AI is building a platform that lets people create and customize 3D games using nothing but text prompts. (Sign up for Beta.)
Google hosted its Made by Google event, with a lot of focus on upcoming AI features:
Gemini Live—Google’s answer to OpenAI’s “Advanced Voice Mode”—is starting to roll out to paying Gemini Advanced subscribers.
Gemini will soon connect to even more apps and become more integrated with your Android phone.
Android phones will get new AI-powered accessibility features.
Pixel 9 phones will ship with AI features like “Add Me” (stitch group shots together to include everyone), “Pixel Studio,” and more.
Tsinghua University researchers developed a novel LongWriter framework for getting LLMs to generate over 10K words of output in one go. (GitHub code.)
📖 AI resources
Helpful stuff that teaches you about AI:
How Meta animates AI-generated images at scale - a curious look at how Meta is able to serve billions of requests for image animations.
How To Make a Movie Teaser With AI - my step-by-step guide to making a movie trailer using nothing but free AI tools (in case you missed it).
KPMG GenAI Survey [PDF] - KPMG’s insights on how leaders are using AI.
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery - a deep dive by Japanese research lab Sakana AI into its novel, LLM-powered system for fully automatic scientific discovery.
Unreasonably Effective AI with Demis Hassabis [VIDEO] - great interview by Professor Hannah Fry with Google DeepMind CEO Demis Hassabis.
🔀 AI random
Other notable AI stories of the week:
Meta and Universal Music Group announced an expanded global agreement to protect human creators and ensure fair compensation.
🤦‍♂️ AI fail of the week
“Don’t make him think. You won’t like him when he’s thinking.” (Final version.)
Anything to share?
Sadly, Substack doesn’t allow free subscribers to comment on posts with paid sections, but I am always open to your feedback. You can message me here:
💰 Sunday Bonus #24: How to make a character enter a scene with Kling AI or Luma
While playing around with Kling and Luma for my last post, I discovered a pretty neat method for creating a video of a character entering a scene.
Below, I describe the step-by-step process, as well as the tools you’ll need.
The best part? This can be done 100% free.
Let’s go!