In 2022, if you offered to show me a tool that creates videos from a single text prompt, I’d call you a witch/wizard and burn you at the stake. I keep a live bonfire running in my backyard for just such occasions.
Today, I’d just ask, “Which one of them are you talking about?”
Back in April, we had our first glimpse of text-to-video with ModelScope and Runway Gen-1.
Since then, things have moved fast!
As I type these words, there are no fewer than six separate companies in this space…that I know of.
There are technically even more than that, but not all text-to-video sites are created equal. Let me briefly explain what I mean…
Different flavors of text-to-video
If you Google “text-to-video,” you’ll end up with a whole range of very different tools. As far as I can tell, they all fall into one of the following four categories:
1. Animated avatars
These text-to-video sites will have a lifelike AI person read out your script using text-to-speech. This is useful for training videos, presentations, and so on.
Prominent examples: Synthesia and HeyGen
2. Slideshow creators
These tools use your script or a text prompt to pull together existing stock photos and video clips into a complete video, typically with a voiceover.
Prominent examples: Pictory and InVideo
3. Frame interpolators
These cycle through slight variations of a single starting frame, creating a morphing video in the process. (I talked about this back in October 2022.)
Prominent examples: Neural Frames and Picsart
(I showed examples of #2 and #3 in this September issue of 10X AI.)
4. “True” text-to-video
These tools use AI to generate brand-new video clips from nothing but a text prompt, much like text-to-image tools do for pictures.
That’s what I’ll focus on today.
Let’s have a closer look at the six tools in this category and take each of them for a quick spin. I’ve sorted them alphabetically for reasons already explained.
1. Genmo
First off, we’ve got Genmo, which I already featured about a month ago.
Genmo has an intuitive website that lets you jump straight into creating videos via a prominent input field.
I’ll be using the same text prompt to test all of our text-to-video sites:
“Cartoon cat chasing a ball of yarn”
If there’s one thing humanity can agree on, it’s that cartoons, cats, and balls of yarn are the shit.
One cool thing about Genmo is that it shows you a work-in-progress snapshot as your video is being rendered, so you get a sneak peek at what awaits:
After around a minute, our 4-second video is ready:
Quite smooth!
We have our cartoon cat. We have the ball of yarn. And we have the chasing of the ball of yarn.
Note that my goal today isn’t to offer robust reviews; I’m just showcasing the tools. Having said that, I’d say Genmo nailed it.
Genmo at a glance:
Advanced features: Control video duration, aspect ratio, camera movement, and motion intensity. Add special visual effects.
Extra tools: 3D asset creator, text-to-image generator, interpolated videos (legacy)
Interface: Website
Unique feature: Work-in-progress preview
Free or paid: 100 free daily credits. Paid monthly plans start at $10.
2. FullJourney
If you ask me, FullJourney is an obvious nod to Midjourney. (Because videos are the full package, while Midjourney’s stuck halfway with only images, you see.)
And just like Midjourney, FullJourney is entirely Discord-based.
FullJourney is the only site on this list that lets you add a soundtrack to your video with a simple prompt:
Let’s have a look at the end result:
Wow.
That sure was…something.
I guess all of the requested elements are technically there. But this is not so much a coherent video as it is what you’ll see if you hook a serial killer up to an MRI.
Now, let’s all collectively ignore what can only be feline genitalia in one of the segments and move on with our lives.
FullJourney at a glance:
Advanced features: Use a negative prompt or a starting image.
Extra tools: Image, audio, GIF, and lip-sync generation
Interface: Discord
Unique feature: Add a soundtrack to your video
Free or paid: Unspecified amount of free starting credits. Paid monthly plans start at $30.
3. Moonvalley
Moonvalley is among the fresh arrivals in this space.
As with FullJourney, you’ll need a Discord account to use it.
Moonvalley lets you select one of five styles for your video. I went with “Anime/Manga” as it seemed fitting for our cartoon cat:
Here we go:
Sweet!
I like the cute style, and we actually have a happy end to this video: The cat gets the yarn. Forget about the multiple tails and the disappearing limbs. What matters is that the yarn has been secured, everyone!
Moonvalley at a glance:
Advanced features: Pick your style (five options), set the video duration, add negative prompts, and select a seed.
Extra tools: None
Interface: Discord
Unique feature: Multiple styles
Free or paid: Appears to be 100% free while in Beta
4. Morph Studio
Morph Studio is yet another newcomer (perhaps the most newcomery?).
It’s also Discord-driven:
For our test, I set the length to the current maximum of 7 seconds:
Drumroll, please…
Ah, yes! The rare Blobus Amorphus cat breed being mercifully put out of its misery by a falling boulder.
To be fair, I mistyped the video length parameter.
Let’s try again with a full 7-second clip:
The Kitty Centipede. Coming soon, to a nightmare near you.
I do like the colorful overall style though, and that’s not nothing.
Morph Studio at a glance:
Advanced features: Control the aspect ratio, amount of motion, camera movement, frame rate, and clip length. Use a starting image.
Extra tools: None
Interface: Discord
Unique feature: 1080p output quality
Free or paid: Appears to be 100% free while in Beta
5. Pika Labs
Pika Labs is one of the veterans in AI video. (That means it’s been around for months rather than weeks.)
It was used to great effect to animate a few images in our recent collaborative video. I also mentioned it in the first October edition of 10X AI.
As with most of these sites, Pika Labs is fully Discord-based:
Pika Labs lets you message the bot directly instead of using a public channel, which is convenient:
Let us have a look-see:
A teleporting cat channeling its inner Nightcrawler? Don’t mind if I do!
But there sure is a fair amount of psychedelic yarn chasing taking place.
Pika Labs at a glance:
Advanced features: Control the aspect ratio, motion intensity, camera movement, frame rate, and prompt guidance strength. Define negative prompt and starting seed. Use a starting image.
Extra tools: None
Interface: Discord
Unique feature: “Encrypt” images and text in your videos. (See here.)
Free or paid: 100% free during Beta, with a paid subscription in the works.
6. Runway
Last but not least, the OG of text-to-video: Runway!
I’ve mentioned it a bunch of times before. Over time, Runway has grown to include a truly impressive range of image and video tools in addition to basic text-to-video generation:
Runway also offers you a free preview of video alternatives before you commit to one, which is great for not wasting credits on useless generations.
Unlike most entries on this list, Runway (like Genmo) has a web interface. No Discord needed:
Now let’s see what it does with our cat prompt:
Well, it’s easily the most polished and realistic video of today, but we asked for a cartoon, and this is very much not.
Also, instead of chasing the yarn, our kitten decided to have an existential crisis while staring blankly into space.
I’m with you, cat. I’m with you.
Runway at a glance:
Advanced features: Control motion intensity, camera movement, and starting seed. Remix an existing video based on a text prompt. Use a starting image. Create frame interpolation videos.
Extra tools: Create images and 3D models/textures. Train your own text-to-image models. A suite of image and video editing tools.
Interface: Website
Unique feature: Extend a video (up to 16 seconds)
Free or paid: 125 free starting credits. Paid monthly plans start at $12.
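If you’d rather compare the six tools side by side than scroll back through the “at a glance” lists, here’s a minimal Python sketch that collects the interface, unique feature, and starting price from above. The `Tool` class and the `by_interface` and `cheapest_paid` helpers are names I made up for illustration; they are not part of any of these products. Tools with no listed paid plan (the Beta ones) get `None` for the price.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Tool:
    """One row of the 'at a glance' facts listed above (hypothetical structure)."""
    name: str
    interface: str                 # "Website" or "Discord", as listed above
    unique_feature: str
    paid_from_usd: Optional[int]   # None = no paid plan price listed above


TOOLS = [
    Tool("Genmo", "Website", "Work-in-progress preview", 10),
    Tool("FullJourney", "Discord", "Add a soundtrack to your video", 30),
    Tool("Moonvalley", "Discord", "Multiple styles", None),
    Tool("Morph Studio", "Discord", "1080p output quality", None),
    Tool("Pika Labs", "Discord", '“Encrypt” images and text in your videos', None),
    Tool("Runway", "Website", "Extend a video (up to 16 seconds)", 12),
]


def by_interface(interface: str) -> List[str]:
    """Return the names of all tools that use the given interface."""
    return [tool.name for tool in TOOLS if tool.interface == interface]


def cheapest_paid() -> Tool:
    """Return the tool with the lowest listed monthly starting price."""
    paid = [tool for tool in TOOLS if tool.paid_from_usd is not None]
    return min(paid, key=lambda tool: tool.paid_from_usd)


if __name__ == "__main__":
    print("No Discord needed:", by_interface("Website"))
    print("Discord-based:", by_interface("Discord"))
    print("Cheapest listed paid plan:", cheapest_paid().name)
```

The prices and features are only as current as this post, so treat the list as a snapshot rather than a source of truth.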
Over to you…
There you have it!
This is nowhere close to an in-depth test that reflects the true capabilities of each tool. It’s really just a quick teaser. I definitely encourage you to check them out for yourself, especially since they’re all free to try.
Are you already using text-to-video tools? Are there any that I missed? Let me know!
You can send me an email at whytryai@substack.com or leave a comment.
They really all do have their pros and cons at the moment, but I strongly suspect everyone will just steal/assimilate all the "pros" from others, and probably pretty soon.
It really is mind-blowing that this was a complete fantasy one year ago.
Great hands-on testing here, Daniel! It's amazing the differences between some of these generators. Moonvalley was shockingly impressive, but FullJourney... wtf. 🤣
Here's a more generalized question for you: why do these AI startups seem to fixate on Discord? Surely a dedicated web interface is less limiting and gives them more control over their interface (and opens up their audience to people over 25 who don't use Discord).
Do you think ChatGPT would've been as much of a success as a Discord bot?