Sunday Rundown #58: Runway Gen-3 for Everyone & a Talking Boulder
Sunday Bonus #18: Using Gemini's video capabilities for meeting insights.
Happy Sunday, friends!
Welcome back to the weekly look at generative AI that covers the following:
Sunday Rundown (free): I share this week’s AI news and an AI fail for your entertainment.
Sunday Bonus (paid): My paid subscribers get a goodie in the form of a guide, an AI tip, a tool walkthrough, etc.
Let’s get to it.
🗞️ AI news
Here are this week’s AI developments.
👩‍💻 AI releases
New stuff you can try right now:
Runway made its latest Gen-3 Alpha text-to-video model available to everyone. (You’ll need at least a Pro account to use it though.)
Suno text-to-music now has an iOS app.
Perplexity introduced a more powerful Pro Search that breaks research down into several steps and analyzes data before spitting out an answer.
French AI lab Kyutai released a voice chat model called Moshi, which responds in real time, much like the voice assistant OpenAI demoed in May.
ElevenLabs has dropped a Voice Isolator tool that can extract vocals from a recording while removing background noise.
🔬 AI research
Cool stuff you might get to try one day:
Meta AI is working on a state-of-the-art text-to-3D model called Meta 3D Gen, which generates better-quality 3D assets than existing models, faster.
📖 AI resources
Helpful stuff that teaches you about AI:
“How Far Can We Scale AI? Gen 3, Claude 3.5 Sonnet and AI Hype” (Video) - great summary of different perspectives on the limits of scaling by AI Explained.
From my sponsor:
Explore SciSpace: an AI platform for researchers. Browse 280M+ papers, conduct literature reviews, chat with PDFs, and get AI-powered summaries.
Use code WTAI40 for 40% off an annual subscription or WTAI20 for 20% off a monthly subscription.
🔀 AI random
Other notable AI stories of the week:
Meta will start being more rigorous in labeling AI-generated content on its platforms.
Anthropic is looking to fund the development of third-party evaluations of AI models, especially in areas of safety and capability.
ElevenLabs has partnered with the estates of iconic stars of the past to incorporate their voices into its Reader App.
🤦‍♂️ AI fail of the week
Talk about being stuck between a rock and a hard place. (Final version.)
Anything to share?
Sadly, Substack doesn’t allow free subscribers to comment on posts with paid sections, but I am always open to your feedback. You can message me here:
💰 Sunday Bonus #18: Extract insights from video meetings with Gemini (for free)
In the past year, we’ve seen an absolute avalanche of paid AI note-takers for online meetings.
And yet, as it turns out, free Gemini 1.5 Pro is exceptionally well-suited for this purpose.
It’s the only free LLM that can parse combined video + audio input out of the box, and it’s freakishly good at it. This means Gemini can pick up visual cues along with what’s being said to provide a holistic analysis of any meeting.
Its 2M-token context window means it can easily handle multi-hour meetings.
Unlike simple AI note-takers and summarizers, Gemini can also recommend follow-up steps, identify potential roadblocks and how to deal with them, spot points of disagreement and suggest compromise solutions, and much more.
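For readers who prefer the API over the Gemini web app, here’s a minimal sketch of the same idea using Google’s `google-generativeai` Python SDK. The prompt wording, the `analyze_meeting` helper, and the polling loop are my own illustrative assumptions, not the prompt from this guide; the upload-then-poll pattern follows the SDK’s documented flow for video files.

```python
"""Sketch: ask Gemini 1.5 Pro for structured insights from a meeting recording.

Assumes you have a Google AI API key and the `google-generativeai` package
installed (`pip install google-generativeai`). The SDK import is guarded so
this file can still be read/tested without the package present.
"""
import time

try:
    import google.generativeai as genai
except ImportError:  # SDK not installed; the sketch still illustrates the flow
    genai = None

# An illustrative "starter" prompt (not the 200-word prompt from the guide).
PROMPT = (
    "You are a meeting analyst. From this recording, extract:\n"
    "1. A concise summary of what was discussed.\n"
    "2. Key decisions and action items (with owners, if mentioned).\n"
    "3. Points of disagreement and possible compromise solutions.\n"
    "4. Recommended follow-up steps and potential roadblocks."
)


def analyze_meeting(video_path: str, api_key: str) -> str:
    """Upload a meeting recording and return Gemini's structured analysis."""
    genai.configure(api_key=api_key)

    # Upload the recording; video files are processed asynchronously,
    # so poll until the file is ready before sending the prompt.
    video = genai.upload_file(video_path)
    while video.state.name == "PROCESSING":
        time.sleep(5)
        video = genai.get_file(video.name)

    model = genai.GenerativeModel("gemini-1.5-pro")
    response = model.generate_content([video, PROMPT])
    return response.text
```

Usage would be a one-liner like `print(analyze_meeting("standup.mp4", api_key))`; the same multimodal request is what the Gemini web app performs for you when you attach a video.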
In today’s guide, I’ll share:
The step-by-step process of getting Gemini to analyze a recorded meeting.
My 200-word “starter” prompt for extracting structured feedback from Gemini.
Ideas for other ways you can use Gemini in this context.
This is the most excited I’ve been to share a goodie with my paid subscribers.
Let’s get to it!