10X AI (Issue #29): Stable Video, ChatGPT Link Shortener, and a Taxi Diver [sic]
PLUS: Claude 2.1, Inflection-2, YouTube-watching Bard, ElevenLabs speech-to-speech, Amazon's AI Ready, Luma Labs upgrades, and two smaller LLMs.
Happy Sunday, friends!
Welcome to 10X AI: a weekly look at beginner-focused AI news, tools, and tips.
Let’s get to it.
🗞️ AI news
Here are this week’s AI developments.
1. Stability AI joins the text-to-video club
I guess this was inevitable.
After all, text-to-video is on the rise.
Meta announced its Emu Video just last week.
Now, it’s Stability AI’s turn.
The Stable Video Diffusion model is officially in research preview and appears to compare favorably to current leaders like Runway and Pika Labs:
Here’s a teaser from Stability AI:
If you know what you’re doing, the model code is on GitHub.
If you don’t mind signing up for an account, you can use the free preview on Decoherence.
You can also try this image-to-video demo on Hugging Face, but it’s slow and prone to crashing at the moment.
2. Claude 2.1 is here!
Not to be outdone by ChatGPT’s recently expanded context window, Anthropic just released Claude 2.1.
The model now has the largest context window in the game (again) with a whopping 200K tokens, which is roughly 500 pages of text.
According to Anthropic, Claude 2.1 is also more accurate and half as likely to hallucinate wrong answers as its predecessor:
Finally, developers can now provide system prompts (custom instructions) to the model and integrate Claude 2.1 with their internal processes and APIs via so-called “tool use.”
You can test the model yourself via Anthropic’s chat interface.
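If you're curious where the "roughly 500 pages" figure for 200K tokens comes from, here's a quick back-of-the-envelope sketch in Python. The two ratios are assumptions based on common rules of thumb (about 0.75 English words per token and about 300 words per page), not official conversion rates, so treat the result as a ballpark.

```python
def tokens_to_pages(tokens, words_per_token=0.75, words_per_page=300):
    """Rough estimate of how many pages of text fit into a token budget.

    Both ratios are assumed rules of thumb, not exact values:
    real tokenizers and page layouts vary quite a bit.
    """
    words = tokens * words_per_token
    return round(words / words_per_page)


# Claude 2.1's 200K-token context window, under the assumptions above
print(tokens_to_pages(200_000))
```

Tweak either ratio and the page count shifts with it, which is why you'll see different page estimates quoted for the same context window.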
3. Inflection-2 is also here!
One LLM per week is no longer enough.
Inflection released version 2.0 of their model with “much improved factual knowledge, better stylistic control, and dramatically improved reasoning.”
In fact, the company claims that Inflection-2 is now the best-performing large language model after GPT-4. Here are the MMLU benchmark scores:
Inflection-2 is apparently also great at math and coding:
Soon, Inflection-2 will power the company’s friendly (and free!) Pi chatbot.
4. Google’s Bard can now watch YouTube
After ChatGPT recently killed all the “chat with your PDF” startups, Bard is coming after the “chat with video” apps.
Bard has gotten a pretty cool upgrade that lets it understand and discuss YouTube videos. In my limited testing, it’s quick and accurate.
For instance, I asked it to summarize this one-hour Kurzgesagt video. About 5 seconds later, here’s the result:
Give it a spin at bard.google.com.
5. ElevenLabs launches speech-to-speech
ElevenLabs is already one of the leaders in text-to-speech.
Now, the company also has an impressive speech-to-speech feature.
It’s uncannily good at capturing the pace, tone, and other characteristics of the input voice. I ran a few tests by simply speaking into the mic and picking some of the premade ElevenLabs voices.
Here’s Mary Poppins endorsing my newsletter:
And here’s whoever this guy is:
The best part?
You can try it for free on their AI Voice Changer page.
6. Amazon wants to teach AI to the masses
Amazon is jumping on the AI education bandwagon by offering free AI courses and skill training via its new “AI Ready” program.
This encompasses a whole range of initiatives, including sponsored scholarships for eligible students and free self-learning courses via AWS Skill Builder and AWS Educate.
You can already explore and take many of the courses: just follow the links above.
7. A bunch of smaller open-source models
After Claude 2.1 and Inflection-2, we’re still not quite done with new LLM launches.
We also got a few noteworthy open-source releases:
Neural Chat 7B (V3.1) from Intel. It’s a fine-tuned version of Mistral 7B and outperforms the base model on most measured LLM benchmarks.
Orca 2 7B from Microsoft, which manages to outperform several much larger models with up to 70B parameters.
8. Luma Labs improvements
I first showcased Luma Labs earlier this month.
Now their Genie has been upgraded to produce higher-quality generations and give you more control via negative prompts and seeds.
You can sign up for free to test it out.
🛠️ AI tools
If you’re building custom GPTs, today’s tool will let you organize them better.
9. ChatGPT Link Shortener
The ChatGPT Link Shortener from Dub.co helps you manage link settings and track analytics for your custom GPTs or ChatGPT conversations.
When creating your short link, you get to customize a whole range of settings:
From then on, you can request an optional QR code for your short link and track the clicks it receives through a user-friendly dashboard:
Dub.co is free to use for up to 1,000 link clicks per month.
AI tip
I’ve hit the limit with all the news this week, so we’ll have to skip the tip this time.
🤦‍♂️ 10. AI (human, technically) fail of the week
“Taxi diver”? Now this is my kind of typo!
Have you checked out Musavir.ai?
Re Bard: it’s definitely limited on videos right now, to those that already have a transcript.