13 Comments
Sep 12 · Liked by Daniel Nest

What a fantastic review! Good stuff.

author

Thanks, I'm happy you found it useful!

What a fantastic review!

It's surprising that some lesser-known tools outperformed big names like Pika Labs. I'm not sure image-to-video AI tools are ready to replace traditional animation yet, but they're getting pretty good.

author

Yeah, I'm honestly shocked by how consistently poorly Pika Labs performs in my tests. And Vidu + PixVerse were definitely pleasant surprises.

Looking at the progress in this space (creepy Will Smith eating spaghetti last year, realistic videos this year), I think we'll get to Midjourney-level photorealism in the AI video space by next year. But time will tell.

Glad you enjoyed the round-up!

3 hrs ago · Liked by Daniel Nest

Daniel always finds cutting-edge tools, thanks for sharing!

I would like to know if there is a natural-sounding, multi-language text-to-speech tool at present.

author

You got it, glad you enjoyed it!

As for text-to-speech, the obvious starting point is ElevenLabs: https://elevenlabs.io/languages

They are the current leaders in text-to-speech, with lots of additional tools (especially if you pay) like voice cloning, a voice isolator, sound effects, etc. They offer 32 languages out of the box, so I guess it depends on which languages you're after!

It makes me wonder if the original surrealist prompt confused the two that made the train run backwards. After all, that's pretty surreal! Same with the sort of ghost train moving from the back to the front.

The kid inside me (like 95% of who I am) is blown away by these incredible things.

author

Could be, although I normally find that text prompts have a rather minor effect when a starting image is used. (In fact, they're pretty much always optional, and you can just upload an image and ask for a video without further context.)

But yeah, I've now been writing about GenAI for 2 years, and while I'm a bit more used to seeing these insane advances, it still feels magical how much AI can do now, in so many areas.

People complain a lot about how they don't do things right or whatever, but does anyone actually realize that 3 years ago, we didn't even have real language models that didn't completely suck? Like, it's positively breathtaking if you zoom out any, at all... at least for someone Gen X or older, anyway.

author

100%. And the pace of progress is still strong in many areas. So it'll be exciting to see where we are a year from now.

I will read all about it here!

I might also be using one or two new tools myself; let's see. Either way, the stuff we're already using is going to continue to improve; this is the worst these tools will ever be.

Sep 12 · Liked by Daniel Nest

Thanks for the introduction. But some of them have heavy censorship, which is crazy annoying. The AI-generated content industry is still at an early stage of development, and they're already telling us what NOT to do.

author

I think some of the censorship is less about AI companies trying to impose their own set of ethics and more about getting ahead of the reasonable backlash over celebrity deepfakes, non-consensual porn images, etc. that will arise. Also, they don't want to risk getting sued, so it's natural for them to err on the side of caution.
