2 Comments
Feb 3, 2023 · Liked by Daniel Nest

Great post! The text prompts certainly seem to involve a strong degree of "interpretation" (just like image AI), but I found the melody conditioning particularly fascinating. The examples produced a wide array of outputs, but the melody itself sounded pretty spot-on in each one.

It's easy to imagine models like this being incorporated into gaming. Variables like health or combat status could easily be fed into the soundtrack on the fly, for example to make things more tense.

author

Yeah, I like the potential this has for even the average person to "compose" entire musical arrangements based on some simple tune they have in their head. You just need to hum it passably and then ask for the genre and instruments you want. You could technically even hum each instrumental track separately and mix them into a complete piece.

And the gaming use case sounds really cool. I didn't even think about that option! It could also work in an immersive cooperative game where players have to communicate via voice chat: the music and atmospheric sounds could respond to their voice intensity or even the content of what they say.

Crazy times we live in. Thanks for the comment!
