10X AI (Issue #36): Google Galore, Dubbing Studio by ElevenLabs, and a Failed Crossover
PLUS: Etsy's Gift Mode, Midjourney updates, OpenAI updates, Adept Fuyu-Heavy, and more AI voices.
Happy Sunday, friends!
Welcome back to 10X AI: a weekly look at generative AI news, tools, and tips for the average user. Today, I bring you more “AI Voices” with people’s takes on AI.
Let’s get to it.
🗞️ AI news
Here are this week’s AI developments.
1. Google’s many AI things
There was enough Google-related stuff this week to fill an entire 10X AI issue. Let’s try to run through it:
a) Google injects AI into Chrome
It was just a matter of time.
Google is bringing three generative AI features into its Chrome browser:
Tab Organizer should finally solve the “too many open tabs” problem we’ve all faced. It uses AI to automatically group open tabs, like so:
AI-generated themes will let you create unique Chrome themes on the fly by selecting the desired mood and color.
Help Me Write…helps you write. Right-clicking on a text box or text field on any site will bring up an AI assistant that helps craft your message.
The above should be available to US users who turn on “Experimental AI” features in Chrome. (“Help Me Write” is expected to launch in February.)
b) New education features with some AI sprinkled in
Just a week after Microsoft had its big “AI in education” play, Google is announcing new education features of its own. Some of them involve AI:
Duet AI lets educators call on generative AI within Google products like Docs, Slides, and Sheets.
AI-powered practice sets can help teachers create interactive lessons and more.
AI-suggested questions (coming soon) will recommend questions for educators to add to YouTube videos assigned as learning materials.
c) Partnership with Hugging Face
Google also announced a collaboration with Hugging Face to advance the use of open models and open-source technologies. Among other initiatives, the vast library of Hugging Face open-source models will be available for deployment within Google Cloud.
d) New text-to-video model teaser
Finally, the company is also cooking up a new text-to-video model called Lumiere:
Lumiere comes with a range of features. We’ve seen many of them in other video models, but some appear unique:
Text-to-video (0:17 above): The standard prompt-to-video generation we’re accustomed to by now.
Image-to-video (0:26 above): Animate a starting image using a text prompt.
Stylized generation (0:45 above): Generate videos in the visual style of a given image.
Cinemagraphs (1:13 above): Google’s feature that works like Runway’s “Motion Brush,” letting you select specific areas of an image to animate.
Video editing (1:21 above): Video inpainting, a la Pika Labs 1.0 or Adobe Project Fast Fill.
Judging by the demo clips, Lumiere is the most realistic video model we’ve seen so far, leaps and bounds above what Google showcased a year ago and ahead of the current crop of video models.
The only question is: When will we get to try the model for ourselves?
Since this is Google Research, your guess is as good as mine.
2. ElevenLabs rolls out Dubbing Studio
ElevenLabs has been offering AI dubbing since last October.
But that was a standalone tool. Now, the company has launched a full-fledged Dubbing Studio that can handle 29 languages.
The Dubbing Studio gives creators much more control over the entire project, from editing transcripts to tweaking translations to injecting new audio and more. Here:
While AI dubbing is available for free, you’ll need at least the “Starter” paid plan to access the Dubbing Studio.
3. Etsy’s “Gift Mode” uses AI to help you find gifts
Etsy has a new ambition: To “become the destination for gifting.”
To achieve that goal, the company just announced Gift Mode™, which uses a combination of human curation and AI voodoo to surface the most relevant products on its site.
You start by selecting the person you’re shopping for, the occasion, and up to three of their interests. Then…something something AI shenanigans…and poof: The perfect gifts appear.
Here’s a demo that’s oozing with energy and enthusiasm:
Convinced?
Gift Mode™ should already be available online and via the Etsy app.
4. Midjourney unveiled several cool things
Midjourney Founder David Holz announced a bunch of updates on Discord:
Let’s break it down:
a) V6 gets access to more tools
You can now use Pan, Zoom Out, and Vary (Region) with your V6 images.
I showcased the “Zoom Out” feature in the Midjourney V5.2 article.
But to test it out, I took our infamous “smiling old man” V6 image from before…
…and then zoomed out 4x to reveal the space around him…
Pan works in a similar way, but instead of zooming out equally on all sides, it simply adds a chunk to one side of the image, depending on which direction you choose to expand:
As for Vary (Region), I wrote an entire article about it. (And now you can also use it on your V6 images.)
b) The Alpha site is available to the 5K Club
Rejoice!
The Alpha site is no longer exclusively for lunatics with almost 17K generated images:
Now, lesser lunatics with “only” 5K images can also ditch Discord.
If you’re one of them, head to alpha.midjourney.com to check it out.
c) Feed ideas to Midjourney via “/feedback”
The Midjourney team just added a new /feedback command that brings up this screen:
Here, you can describe what changes or new features you’d like to see and rate other people’s ideas at the bottom.
5. OpenAI news
OpenAI was also busy this week, with a slew of announcements that are mostly interesting for developers:
Two new text embedding models
A “less lazy” GPT-4 Turbo (hopefully this will make its way into ChatGPT Plus)
A cheaper GPT-3.5 Turbo
A new moderation model to help developers identify harmful input
If you’re a ChatGPT Plus user, there’s also a treat waiting for you: Pressing “@” in a chat lets you drag any of your custom GPTs directly into the conversation:
So now you can unleash the combined power of various specialized GPTs in a single thread instead of having to start new standalone chats.
6. Adept Fuyu-Heavy: New kid on the multimodal block
Adept—who brought us Fuyu-8B in October—just announced a new model.
It’s called Adept Fuyu-Heavy, and it’s currently the third-best multimodal model behind only GPT-4V and Gemini Ultra (despite being over 10X smaller).
Here’s a quick showcase of its multimodal capabilities:
There doesn’t seem to be a public demo yet, but keep an eye out.
🗣️ AI voices
This is the second time I’m running the “AI voices” segment.
I want to expose my readers to a broader range of perspectives than just my own. “AI voices” is all about showcasing how different people use AI and learning from their experiences.
If you have any burning questions you’d like the wider Substack community to answer, this is also your chance to provide input for future “voices” editions.
(Vote in the poll at the end or leave a comment to let me know what you think.)
7. Charlie Guo is sure it’s not too late to dive into AI
Here’s Charlie: “With so much going on in AI, it can be easy to think you're behind or feel like it's too late to get involved. But that couldn't be further from the truth! We are so, so, so early when it comes to LLMs and generative AI - while it might seem like things are ‘figured out,’ nearly everyone is figuring things out as they go.
If you're interested in this stuff, the best thing to do is to start experimenting. If you're technical, do a coding tutorial. If you're non-technical, test different prompts or products. Try things for yourself. Form your own opinions. I promise you won't regret it.”
Check out Charlie’s Substack:
8. Michael Woudenberg says AI can help fiction writers
Here’s Michael: “When I wrote my sci-fi adventure Paradox, AI helped. I already had my outline, ideas, and themes. Then I used Grammarly, WordTune, and Bing (now Microsoft Copilot) for editing and research. AI was also useful for coming up with character names, crafting scene descriptions, and working out kinks in the narrative.
Some suggest using AI as a co-author, but I believe this doesn't rely enough on the uniqueness of human creativity. The truth is, AI is a terrible author. But it did help me get off the ‘blank page’ problem. Having something, even bad, can get you moving. Instead of outsourcing my creativity, I supercharged it with tools that allowed me to find the truly creative elements.”
Check out Michael’s Substack:
9. Nat shares a few AI observations and tips
Here’s Nat: “When I want to write about a complex topic, I ask GPT-4 to pose difficult questions about it. This checks my knowledge and lets GPT-4 evaluate my answers objectively. As an Airbnb Superhost, I’ve fine-tuned an AI model to complement my creative side with data analysis. I also developed a custom GPT model to compose chess puzzle descriptions.
Here are two tips: When using AI, try asking it to ‘Put yourself in my place’ when you discuss an important aspect of your business or personal life. If you’re unsatisfied with the output, you can jokingly supplement with: ‘Stop acting like a lazy Garfield that everybody complains about.’ Have fun with it, and you'll find that GPT-4 will do its best.”
Check out Nat’s Substack:
🤦‍♂️ AI fail of the week
10. Note to self: Never ask for a scene of Batman fighting Darth Vader
Sunday poll time
The last “AI voices” poll was polluted by bot traffic (or a technical glitch). It got over 40 votes within the first minute of the article going live. Also, now that I’ve featured “AI voices” twice, I’d love your input.
AI voices section is so important, rising tide lifts all the boats and all that. Also I've just switched to Arc browser but Google tempts me back 🤌
Just FYI, I'm using Chrome right now, and the drop-down from the upper left doesn't offer the option to organize the tabs yet.
I like the AI voices section, and I think it has legs! Sure, I know and like all of the folks you included, but I also think it adds a touch of spice to the newsletter.