Sunday Rundown #65: Visual AI Galore & Kung Fu Fly-ting
Sunday Bonus #25: Using Gemini uncensored in Google AI Studio.
Reminder: Midjourney workshop on Friday
My first-ever live workshop for paid subscribers—hosted by Charlie Guo of Artificial Ignorance—is on August 30 at 11 AM PST.
If you’re a paid subscriber to Why Try AI, see event details here.
If you’re a paid subscriber to Artificial Ignorance, see event details here.
Happy Sunday, friends!
Welcome back to my weekly look at generative AI, which covers the following:
Sunday Rundown (free): this week’s AI news + a fun AI fail.
Sunday Bonus (paid): a goodie for my paid subscribers.
Let’s get to it.
🗞️ AI news
Here are this week’s AI developments.
We got a whole lot of new image and video models to play with.
👩‍💻 AI releases
New stuff you can try right now:
AI21 Labs released the Jamba 1.5 family of fast, efficient models that boast the longest context window among open models (256K tokens).
Anthropic gave Claude the ability to display mathematical expressions using LaTeX.
D-ID released the AI Video Translate tool with automatic bulk translation into multiple languages, voice cloning, and lip sync.
ElevenLabs slashed the price of its Turbo models, so they’re 50% cheaper to use.
Google added a minor Gmail feature that lets Gemini polish your draft with one click.
Hotshot is a new text-to-video model capable of making short clips from text prompts. You get 2 free credits per day to try it out.
Ideogram launched a significantly improved Ideogram 2.0 model that also offers style presets and built-in color palette control. (I recently showed how you can control the color palette using style references in Midjourney.)
The holistic AI film-making platform LTX Studio is finally available to everyone and boasts five new features that give you more precise control over video creation.
Luma Labs released Dream Machine 1.5 with better prompt adherence, text rendering inside video clips, and higher-quality video in general.
Microsoft released three new additions to its Phi line of small language models:
Phi-3.5-MoE: a mixture-of-experts model that outperforms similarly sized competitors.
Phi-3.5-mini: an upgrade to Phi-3-mini.
Phi-3.5-vision: a vision model with “cutting-edge capabilities for multi-frame image understanding and reasoning.”
Midjourney users can finally skip Discord and start using the website directly. Also, the 25-generations free trial is temporarily back, so now’s a great time to take Midjourney for a spin!
OpenAI now lets developers fine-tune GPT-4o for their use cases.
🔬 AI research
Cool stuff you might get to try one day:
Freepik is testing the alpha version of its new text-to-image model called Mystic (by the team behind the Magnific AI upscaler). First glimpses from current testers look impressive.
Perplexity is rolling out a Code Interpreter that can install libraries and render charts in your search results on the fly. (Somewhat similar to the ChatGPT version.)
📖 AI resources
Helpful stuff that teaches you about AI:
Google AI Studio Prompt Gallery - a collection of ready-made prompts you can immediately start using with Gemini models inside the AI Studio.
🔀 AI random
Other notable AI stories of the week:
OpenAI partnered with Condé Nast to display content from its brands (e.g. Vogue, The New Yorker, Wired, and many more) within ChatGPT and other products.
🤦‍♂️ AI fail of the week
Nailed that kung fu entrance, Kling! So realistic.
(The process—and slightly more convincing results—here.)
Anything to share?
Sadly, Substack doesn’t allow free subscribers to comment on posts with paid sections, but I am always open to your feedback. You can message me here:
💰 Sunday Bonus #25: How to remove Gemini censorship in Google AI Studio
I recently mentioned that you might sometimes want an uncensored LLM.
The good news is that you don’t have to scour the Dark Web or sell your soul to Sam Altman to do this.
There’s an easy way to use Google’s Gemini models with far fewer guardrails directly inside the official Google AI Studio.
Let me show you the two-step process.
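For context before the walkthrough: Google AI Studio exposes per-category safety filters, and the Gemini API accepts the same settings programmatically. Below is a minimal sketch of what a `generateContent` request body looks like with every adjustable filter relaxed to `BLOCK_NONE`. This illustrates the general mechanism, not the exact two steps covered in the workshop; the category and threshold names are the Gemini API's enum values, and the actual API call is omitted since it requires a key.

```python
# Sketch: build a Gemini generateContent request body with all four
# adjustable safety filters set to BLOCK_NONE. Only constructs the
# payload; no network call is made.

HARM_CATEGORIES = [
    "HARM_CATEGORY_HARASSMENT",
    "HARM_CATEGORY_HATE_SPEECH",
    "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "HARM_CATEGORY_DANGEROUS_CONTENT",
]

def build_request(prompt: str) -> dict:
    """Request body with every safety filter relaxed to BLOCK_NONE."""
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "safetySettings": [
            {"category": c, "threshold": "BLOCK_NONE"}
            for c in HARM_CATEGORIES
        ],
    }

body = build_request("Hello, Gemini")
print(len(body["safetySettings"]))  # 4
```

In the AI Studio UI, the equivalent controls are the per-category sliders in the safety settings panel, which is what the two-step process below works with.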