Amazon's Alexa is a voice assistant that could soon sound a lot more familiar. The company announced on Wednesday, during its annual re:MARS conference, that it is working on an artificial intelligence update to its Alexa system that will allow the device to mimic any voice, even that of a dead family member. In a video, Amazon (AMZN) demonstrated how, instead of Alexa's signature voice reading a story to a young child, it was his grandmother's voice.
Amazon senior vice president Rohit Prasad said the updated system will be able to collect enough voice data from less than a minute of audio to make personalization like this possible, instead of having someone spend hours in a recording studio, as has been done in the past. Prasad did not elaborate on when the feature could launch, and Amazon declined to comment.
The concept came from Amazon looking at new ways to add more "human attributes" to the device, "especially in these times of the ongoing pandemic, when so many of us have lost loved ones," Prasad said. He added that while AI cannot take away the pain of losing a loved one, it can help make memories last.
Amazon has long used recognizable voices, such as the real voices of Samuel L. Jackson, Melissa McCarthy and Shaquille O'Neal, to voice Alexa. But AI recreations of people's voices have also improved a lot over the years, particularly with the use of AI and deepfake technology. In the Anthony Bourdain documentary "Roadrunner," for example, three lines were generated by AI, even though they sounded as if they were spoken by the late media personality.
That case caused a stir because the film did not make clear that the dialogue was AI-generated, and the lines had not been approved by Bourdain's estate. Director Morgan Neville said there is a possibility of a documentary ethics panel on the matter later. More recently, actor Val Kilmer, who lost his voice to throat cancer, partnered with startup Sonantic to create an AI-driven speaking voice for him in the new film "Top Gun: Maverick." The company used archival audio footage of Kilmer to teach an algorithm how to speak like the actor, according to Variety.
A senior analyst at IDC Research said he sees the value in Amazon's effort. Amazon seems interested in doing this because it has the capability and the technology, and it is always researching ways to improve the smart assistant and smart home experience, he said. Whether the feature leads to a deeper connection with Alexa, or just becomes a skill that some folks dabble with from time to time, remains to be seen.
Amazon's personalized Alexa voices may struggle most with the uncanny valley effect: recreating a voice that is familiar to loved ones but isn't quite right, which leads real humans to reject it.
HOW ALEXA WORKS
If you have a recording of the person whose voice you want to imitate, Alexa needs only about a minute of that audio to learn the voice and reproduce it. "State-of-the-art text-to-speech (TTS) systems require several hours of recorded speech data to generate high-quality synthetic speech," explained Amazon. "When using reduced amounts of training data, standard TTS models suffer from speech quality and intelligibility degradations, making training low-resource TTS systems problematic."
To address this, Amazon's researchers proposed a novel, extremely low-resource TTS method called Voice Filter that uses as little as one minute of speech from a target speaker. "It uses voice conversion (VC) as a post-processing module to a pre-existing high-quality TTS system and marks a conceptual shift in the existing TTS paradigm, framing the few-shot TTS problem as a VC task," the company said.
There are concerns, however, because the tool could also be used to copy any person's voice and impersonate them, which could be very dangerous and could harm innocent people. Amazon is reluctant to roll it out globally for the time being.
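For readers who think in code, the two-stage design described above (an existing TTS system followed by a voice conversion step) can be pictured roughly as follows. This is only an illustrative sketch, not Amazon's implementation: every name in it (`enroll_speaker`, `base_tts`, `voice_filter`, `speak_as`, `SpeakerProfile`) is hypothetical, the 16 kHz sample rate is an assumption, and both stages are placeholders that produce silence rather than real speech.

```python
from dataclasses import dataclass
from typing import List

Audio = List[float]      # mono samples; stand-in for a real waveform type
SAMPLE_RATE = 16_000     # assumed sample rate in Hz


@dataclass
class SpeakerProfile:
    """Hypothetical summary of a target speaker built from a short recording."""
    name: str
    seconds_of_audio: float


def enroll_speaker(name: str, recording: Audio) -> SpeakerProfile:
    # A real system would learn voice characteristics here;
    # this placeholder only records how much audio was supplied.
    return SpeakerProfile(name, len(recording) / SAMPLE_RATE)


def base_tts(text: str) -> Audio:
    # Stage 1: the pre-existing, high-quality TTS system (stubbed out).
    # It emits 1/20 s of silence per character in a generic voice.
    samples_per_char = SAMPLE_RATE // 20
    return [0.0] * (len(text) * samples_per_char)


def voice_filter(audio: Audio, profile: SpeakerProfile) -> Audio:
    # Stage 2: voice conversion as a post-processing step (stubbed out).
    # A real converter would reshape the audio to sound like `profile`.
    return list(audio)


def speak_as(text: str, profile: SpeakerProfile) -> Audio:
    # The TTS stage stays the same for everyone; only stage 2 is per speaker.
    return voice_filter(base_tts(text), profile)


# About one minute of (silent) enrollment audio, as in the article.
grandma = enroll_speaker("grandma", [0.0] * (60 * SAMPLE_RATE))
clip = speak_as("Once upon a time", grandma)
print(grandma.seconds_of_audio, len(clip) / SAMPLE_RATE)
```

The point of the structure is that the expensive TTS stage is trained once and reused, so only the lightweight conversion stage needs the one-minute sample from the person being imitated.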