Washed Out “The Hardest Part” – Made with OpenAI’s Sora

Read More

#audio, #videos

How to make music with AI using Udio

There’s something quite alluring about trying to create art in a form you’re less familiar with. AI music is the latest canvas in this space.

While we can easily sketch a drawing with a pen and piece of paper at home, not all of us have instruments lying around or the skills to use them.

Generative AI gets rid of those hurdles and tools like Udio, Stable Audio, Cassette AI and Suno allow us to dip our toes into music production. Prior experience is not required. Furthermore, Udio seems to be on to something in that it is able to combine a simple user experience with pretty decent results. — Read More

#audio

Drake Uses AI Tupac and Snoop Dogg Vocals on ‘Taylor Made Freestyle’

The beef between Drake and what continues to be a strong sect of the hip-hop community grows deeper. On Friday night (April 19), the rapper released a song on his social media entitled “Taylor Made Freestyle,” which uses AI vocals from Tupac Shakur and Snoop Dogg on a stopgap between diss records as he awaits Kendrick Lamar’s reply to his freshly released “Push Ups.” – Read More

#audio

200+ Artists Urge Tech Platforms: Stop Devaluing Music

STOP DEVALUING MUSIC. An open letter signed by over 200 musicians calls on AI developers, tech companies, platforms and digital music services to stop using AI to “infringe upon and devalue the rights of human artists.”  — Read More

#audio, #vfx

OpenAI built a voice cloning tool, but you can’t use it… yet

As deepfakes proliferate, OpenAI is refining the tech used to clone voices — but the company insists it’s doing so responsibly.

Today marks the preview debut of OpenAI’s Voice Engine, an expansion of the company’s existing text-to-speech API. Under development for about two years, Voice Engine allows users to upload any 15-second voice sample to generate a synthetic copy of that voice. But there’s no date for public availability yet, giving the company time to respond to how the model is used and abused.

“We want to make sure that everyone feels good about how it’s being deployed — that we understand the landscape of where this tech is dangerous and we have mitigations in place for that,” Jeff Harris, a member of the product staff at OpenAI, told TechCrunch in an interview. — Read More

#audio

Mikey Shulman: Suno and the Sound of AI Music

Read More

#audio, #videos

If you thought Sora was impressive now watch it with AI generated sound from ElevenLabs

Artificial intelligence speech startup ElevenLabs offered an insight into what it’s planning to release in the future, adding sound effects to AI-generated video for the first time.

Best known for its near human-like text-to-speech and synthetic voice services, ElevenLabs added artificially generated sound effects to videos produced using OpenAI’s Sora.

OpenAI unveiled its impressive Sora text-to-video artificial intelligence model last week, showcasing some of the most realistic, consistent and longest AI generated video to date. — Read More

#audio, #vfx

AI Sound Effects

We were blown away by the Sora announcement but felt it needed something… What if you could describe a sound and generate it with AI? — Read More

#audio

TikTok can generate AI songs, but it probably shouldn’t

TikTok has launched many songs that have gone viral over the years, but now it’s testing a feature that lets more people exercise their songwriting skills… with some help from AI.

AI Song generates songs from text prompts with help from the large language model Bloom. Users can write out lyrics in the text field when making a post. TikTok will then recommend AI Song to add sounds to the post, and they can toggle the song’s genre. – Read More

#audio

OpenVoice: Versatile Instant Voice Cloning

We introduce OpenVoice, a versatile voice cloning approach that requires only a short audio clip from the reference speaker to replicate their voice and generate speech in multiple languages. OpenVoice represents a significant advancement in addressing the following open challenges in the field: 1) Flexible Voice Style Control. OpenVoice enables granular control over voice styles, including emotion, accent, rhythm, pauses, and intonation, in addition to replicating the tone color of the reference speaker. The voice styles are not directly copied from and constrained by the style of the reference speaker. Previous approaches lacked the ability to flexibly manipulate voice styles after cloning. 2) Zero-Shot Cross-Lingual Voice Cloning. OpenVoice achieves zero-shot cross-lingual voice cloning for languages not included in the massive-speaker training set. Unlike previous approaches, which typically require an extensive massive-speaker multi-lingual (MSML) dataset for all languages, OpenVoice can clone voices into a new language without any massive-speaker training data for that language. OpenVoice is also computationally efficient, costing tens of times less than commercially available APIs that offer inferior performance. To foster further research in the field, we have made the source code and trained model publicly accessible. We also provide qualitative results in our demo website. Prior to its public release, our internal version of OpenVoice was used tens of millions of times by users worldwide between May and October 2023, serving as the backend of MyShell. – Read More

#nlp, #audio