Sunday Rundown #51: OpenAI vs. Google & A Handy Extra Hand
Sunday Showdown #11: GPT-4o vs. Gemini 1.5 Pro: Who writes the best product copy?
Happy Sunday, friends!
Welcome back to the weekly look at generative AI, which covers the following:
Sunday Rundown + AI Fail (free): I share this week’s AI news and a fail for your entertainment.
Sunday Showdown + AI Tip (paid): I pit AI tools against each other and share a hands-on tip for working with AI.
On today’s “Sunday Showdown,” I’ll see how good the new GPT-4o and the not-as-new Gemini 1.5 Pro are at writing marketing copy for a silly fake product.
Let’s get to it.
🗞️ AI news
Here are this week’s AI developments.
Big week, with OpenAI demos on Monday and Google I/O on Tuesday. I’ll be splitting their announcements into the usual “available now” and “coming eventually” buckets.
👩‍💻 AI releases
New stuff you can try right now:
OpenAI’s announcements (watch the live stream here):
New best-in-class model called GPT-4o (“o” is for “omni”). It’s available to everyone for free. (Only the chat interface for now. Voice chat is coming later.)
New ChatGPT desktop app for macOS.
Free users can discover and try custom GPTs (previously a paid-only feature).
Google’s announcements (watch the recap here):
AI Overviews (formerly “Search Generative Experience”) are on their way out of beta and are rolling out to all users, starting with the US.
Gemini 1.5 Pro and new productivity features rolling out to Gemini for Google Workspace users.
A lightweight, fast, and cost-efficient multimodal model called Gemini 1.5 Flash.
A new model in the Gemma family: PaliGemma.
OpenAI is also expanding ChatGPT’s data analysis features, so you can connect directly to files in Google Drive and MS OneDrive, manipulate tables in the chat itself, and more.
Anthropic has—finally!—made Claude available in Europe. (Daniel Nests of the world, rejoice!)
Claude can now prompt itself on your behalf.
ElevenLabs launched its Dubbing API, letting developers incorporate audio and video translation into their apps and products.
Hume AI launched “Chatter,” an interactive news podcast you can talk to. (Try it here.)
🔬 AI research
Cool stuff you might get to try one day:
OpenAI will make the alpha version of the new Voice Mode (as shown in the demos) available to ChatGPT Plus users in “the coming weeks.”
Upcoming releases announced during Google I/O:
Ask Photos will let you ask Gemini to find photos based on complex and detailed queries.
New capabilities coming to Gemini on Android.
Gemini Advanced users will be able to make and use Gems (Google’s answer to OpenAI’s Custom GPTs).
Project Astra is a real-time AI agent/assistant that can see what you see and interact with you via voice.
Veo (text-to-video model) can generate 1080p videos that are over a minute long from a single prompt. (Sign up for the waitlist.)
Gemma 2, the next iteration of the Gemma family.
Imagen 3, the next version of Google’s text-to-image model.
📖 AI resources
Helpful stuff that teaches you about AI:
“Learn how to use Gemini” - a free course by Ben’s Bites.
🔀 AI random
Other notable AI stories of the week:
OpenAI has partnered with Reddit. ChatGPT will now be hooked into Reddit’s structured content in real time, while Reddit will be able to offer AI-powered features to its users.
🤦‍♂️ AI fail of the week
When your smartphone needs a dedicated holder (final cartoon here):
⚔️ Sunday Showdown #11 - GPT-4o vs. Gemini 1.5 Pro: Who writes the best product copy?
This isn’t the first time Gemini and GPT have duked it out.
Gemini 1.5 Pro met GPT-4o’s older cousin, GPT-4 Turbo, when the two of them wrote microfiction.
Today, we have a more serious task: writing product marketing copy.
But to lighten the mood, the product itself is silly and made up.
Let’s get going!