Google Translate now lets you hear real-time translations in your headphones

Google is rolling out a beta experience that lets you hear real-time translations in your headphones, the company announced on Friday. The tech giant is also bringing advanced Gemini capabilities to Google Translate and expanding its language-learning tools in the Translate app.

The new real-time headphone translation experience preserves each speaker’s tone, emphasis, and cadence, so it’s easier to follow the conversation and tell who’s saying what, Google says. The capability essentially turns any pair of headphones into a real-time, one-way translation device. — Read More

#audio

The No. 1 Country Song in America Is AI-Generated

According to Billboard’s “Country Digital Song Sales” chart, the No. 1 song in the U.S. is “Walk My Walk” by Breaking Rust—an artist that was created by artificial intelligence (AI).

This is a first for the music industry: no AI-created song had previously reached the top of the charts.

There have long been concerns about the use of generative AI in creative sectors. The debate came to the fore a few years ago, when Hollywood’s writers’ and actors’ guilds staged protests shortly after the public release of ChatGPT, driven in part by concerns about the new technology and its implications.

… As the AI revolution continues to reshape creative industries, more AI-generated artists may appear in the charts, and pushback is likely. — Read More

#audio

OpenAI reportedly developing new generative music tool

OpenAI is working on a new tool that would generate music based on text and audio prompts, according to a report in The Information.

… One source told The Information that OpenAI is working with some students from the Juilliard School to annotate scores as a way to provide training data. — Read More

#audio

The Voice Lives On: Moises Powers Whitney Houston’s Return to the Stage

Moises’ AI stem separation technology extracts Whitney Houston’s vocals from original recordings, enabling live orchestral performances across a seven-city tour

Whitney Houston’s voice moved generations, and through a collaboration between The Estate of Whitney E. Houston, Primary Wave Music, and Park Avenue Artists, it has now returned to the stage. The Voice of Whitney: A Symphonic Celebration, which debuted in August 2024, brings Houston’s legendary vocals to concert halls across US cities. The concert transports fans into Houston’s musical world, as live orchestras perform alongside Houston’s vocals and rare footage. Audiences experience the power of Houston’s voice in a live setting, with a breathtaking fusion of technology and artistry that celebrates her enduring legacy. — Read More

#audio

Voice Agent Engineering

Read More

#audio, #videos

Amazing New Technology Can ‘Bend’ Sounds Into Your Ears Only

What if you could listen to music or a podcast without headphones or earbuds and without disturbing anyone around you? Or have a private conversation in public without other people hearing you?

Our newly published research introduces a way to create audible enclaves – localized pockets of sound that are isolated from their surroundings. In other words, we’ve developed a technology that could create sound exactly where it needs to be.

The ability to send sound that becomes audible only at a specific location could transform entertainment, communication and spatial audio experiences. — Read More

#audio

Eerily realistic AI voice demo sparks amazement and discomfort online

In late 2013, the Spike Jonze film Her imagined a future where people would form emotional connections with AI voice assistants. Nearly 12 years later, that fictional premise has veered closer to reality with the release of a new conversational voice model from AI startup Sesame that has left many users both fascinated and unnerved.

“I tried the demo, and it was genuinely startling how human it felt,” wrote one Hacker News user who tested the system. “I’m almost a bit worried I will start feeling emotionally attached to a voice assistant with this level of human-like sound.”

In late February, Sesame released a demo for the company’s new Conversational Speech Model (CSM) that appears to cross over what many consider the “uncanny valley” of AI-generated speech, with some testers reporting emotional connections to the male or female voice assistant (“Miles” and “Maya”). — Read More

#audio

Generating audio for video

Video-to-audio research uses video pixels and text prompts to generate rich soundtracks

Video generation models are advancing at an incredible pace, but many current systems can only generate silent output. One of the next major steps toward bringing generated movies to life is creating soundtracks for these silent videos.

Today, we’re sharing progress on our video-to-audio (V2A) technology, which makes synchronized audiovisual generation possible. V2A combines video pixels with natural language text prompts to generate rich soundscapes for the on-screen action. — Read More

Read the Paper

#audio

OpenAI pauses use of “Sky” voice after threat of legal action

OpenAI has paused a voice mode option for ChatGPT-4o, Sky, after backlash accusing the AI company of intentionally ripping off Scarlett Johansson’s critically acclaimed voice-acting performance in the 2013 sci-fi film Her.

In a blog defending its casting decision for Sky, OpenAI went into great detail explaining its process for choosing the individual voice options for its chatbot. But ultimately, the company seemed pressed to admit that Sky’s voice was just too similar to Johansson’s to keep using it, at least for now. — Read More

#audio, #ethics

Randy Travis’s New Song Recreates His Voice With AI Technology

Randy Travis, who lost much of his speech in a 2013 stroke, used artificial intelligence technology to clone his voice for his first recording in more than a decade.

Travis, his longtime producer Kyle Lehning, Travis’s wife Mary, and Warner Music Nashville co-chair and co-president Cris Lacy spoke with CBS Sunday Morning to detail how AI helped create “Where That Came From,” Travis’s new song, which was released on Friday. The full report will air Sunday. — Read More

#audio