Sunday Rundown #54: Stable Audio Open & A Very Naughty Cat
Sunday Showdown #14: Stable Audio Open vs. ElevenLabs: Who makes the most realistic sounds?
Happy Sunday, friends!
Welcome back to the weekly look at generative AI that covers the following:
Sunday Rundown + AI Fail (free): I share this week’s AI news and a fail for your entertainment.
Sunday Showdown + AI Tip (paid): I pit AI tools against each other and share a hands-on tip for working with AI.
On today’s Sunday Showdown, we’ll see who is best at making sound effects: Stable Audio Open or ElevenLabs.
Let’s get to it.
🗞️ AI news
Here are this week’s AI developments.
👩‍💻 AI releases
New stuff you can try right now:
Stability AI released an open-source text-to-audio engine called Stable Audio Open. It lets you generate sound and music samples of up to 47 seconds.
Google made NotebookLM more useful by upgrading it to Gemini 1.5 Pro, adding support for Google Slides and URLs as sources, and more. (I’ve been a fan of NotebookLM for a while.)
Udio introduced the option to upload your own audio to serve as a starting point for generating new tracks. (Paid users only for now.)
The ChatGPT mobile app now lets you use voice chat in “Background Conversations” mode, so you can keep talking while you use other apps or when your screen is off.
The PixVerse video platform added a “Magic Brush” feature that works just like Runway’s “Motion Brush,” letting you control which parts of the image to animate.
Domo AI launched new art styles for its video-to-video function.
Chinese researchers unveiled ToonCrafter, which can seamlessly fill in and animate the “gap” between two given cartoon frames to turn them into a short video.
🔬 AI research
Cool stuff you might get to try one day:
Chinese video platform Kuaishou Technology unveiled Kling, its answer to OpenAI’s Sora, which can make high-definition videos of up to 2 minutes from a single text prompt. (You can technically already try it, but you’d need a Chinese phone number.)
NVIDIA demoed Project G-Assist, which uses AI to offer helpful in-game tips to gamers.
Google is reportedly working on a “memory” feature for its Chromebooks that will work a bit like Microsoft’s controversial “Recall” but hopefully without the creep factor.
📖 AI resources
Helpful stuff that teaches you about AI:
“Extracting Concepts from GPT-4” - OpenAI’s deep dive into new methods to make sense of how LLMs work (à la what Anthropic did two weeks ago).
“Zoom CEO Eric Yuan wants AI clones in meetings” - interview by Nilay Patel (Decoder).
🤦‍♂️ AI fail of the week
Before this cartoon, the cat’s research took it to some strange places:
Anything to share?
Sadly, Substack doesn’t allow free subscribers to comment on posts with paid sections, but I am always open to your feedback. You can message me here:
⚔️ Sunday Showdown #14 - Stable Audio Open vs. “Sound Effects”: Who makes better sounds?
In March, I tested the beta version of ElevenLabs’ “Sound Effects” against Meta’s “Audiobox.”
Here in June, ElevenLabs released “Sound Effects” to everyone and Stability AI just launched Stable Audio Open (see above).
As such, a rematch is in order.
Can Stable Audio Open give ElevenLabs’ “Sound Effects” a run for its money?
There’s only one way to find out…