What a fantastic review! Good stuff.
Thanks, I'm happy you found it useful!
What a fantastic review!
It's surprising that some lesser-known tools outperformed big names like Pika Labs. Not sure image-to-video AI tools are ready to replace traditional animation yet, but these are getting pretty good.
Yeah I'm honestly shocked by how consistently poorly Pika Labs performs in my tests. And Vidu + PixVerse were definitely pleasant surprises.
Looking at the progress in this space (creepy Will Smith eating spaghetti last year, realistic videos this year), I think we'll get to Midjourney-level photorealism in the AI video space by next year. But time will tell.
Glad you enjoyed the round-up!
Daniel always finds cutting edge tools, thanks for sharing!
I would like to know if there is a natural-sounding multi-language text-to-speech tool at present?
You got it, glad you enjoyed it!
As for text-to-speech, the obvious starting point is ElevenLabs: https://elevenlabs.io/languages
They are the current leaders in text-to-speech with lots of additional tools (especially if you pay) like voice cloning, voice isolator, sound effects, etc. They offer 32 languages out of the box, so I guess it depends on which languages you're after!
Heard of it before! I'll give it a try! Thanks again for your patience!
It makes me wonder if the original surrealist prompt confused the two that made the train run backwards. After all, that's pretty surreal! Same with the sort of ghost train moving from the back to the front.
The kid inside me (like 95% of who I am) is blown away by these incredible things.
Could be, although I normally find that text prompts have a rather minor effect when a starting image is used. (In fact, they're pretty much always optional, and you can just upload an image and ask for a video without further context.)
But yeah, I've now been writing about GenAI for 2 years and while I'm a bit more used to seeing these insane advances, it still feels magical how much AI can do now, in so many areas.
People complain a lot about how they don't do things right or whatever, but does anyone actually realize that 3 years ago, we didn't even have real language models that didn't completely suck? Like, it's positively breathtaking if you zoom out any, at all... at least for someone Gen X or older, anyway.
100%. And the pace of progress is still strong in many areas. So it'll be exciting to see where we are a year from now.
I will read all about it here!
I might also be using one or two new tools myself; let's see. Either way, the stuff we're already using is going to continue to improve; this is the worst these tools will ever be at any point in the future.
Thanks for the introduction. But some of them have heavy censorship, which is crazy annoying. The AI-generated content industry is still at an early stage of development, and they already tell us what NOT to do.
I think some of the censorship is less about AI companies trying to impose their own set of ethics and more about attempts to get ahead of the reasonable backlash about celebrity deepfakes, non-consensual porn images, etc. that will arise. Also, they don't want to risk getting sued, so it's natural for them to err on the side of caution.