Advancing diversity and inclusion in voice AI with speech disentanglement

In June 2022, Amazon re:MARS, the company's in-person event exploring advances and practical applications in machine learning, automation, robotics, and space (MARS), took place in Las Vegas. The event gathered thought leaders and technical experts building the future of artificial intelligence and machine learning, and included keynote talks, innovation spotlights, and a series of breakout-session conversations.

Now, in our re:MARS revisited series, Amazon Science looks back at some of the keynotes and breakout-session talks from the conference. We asked presenters three questions about their sessions and are sharing the full video of their presentations.

On June 24, Ewa Kolczyk, a senior software development manager with Amazon Web Services (AWS), and Kayoko Yanagisawa, a senior speech scientist with Alexa, presented their talk, "Advancing diversity and inclusion in voice AI with speech disentanglement". Their presentation focused on speech disentanglement and how Amazon uses this technique to influence various aspects of speech, such as tone, phrasing, intonation, expressiveness, and accent, to create unique Alexa responses.

What was the central theme of your presentation?

In this presentation, we talked about how we use machine learning (ML) text-to-speech (TTS) techniques to improve diversity, equity, and inclusion (DEI), so that Alexa's responses work well for everyone. We use speech disentanglement techniques to separate the various aspects of speech, such as language, accent, age, gender, and emotion, so we can modify them to create voices that speak more languages or accents, or create new voices of any gender, age, or accent. We also talked about Alexa's preferred speaking rate feature and whisper mode, which help customers with different needs.
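For readers curious what "separating the aspects of speech" can look like in a model, here is a minimal, hypothetical sketch (not Amazon's implementation): a TTS acoustic model that keeps separate embeddings for speaker identity, accent, and speaking style, so that each attribute can be swapped independently at synthesis time. All names and dimensions below are illustrative assumptions.

```python
# Hypothetical sketch of a disentangled TTS acoustic model (PyTorch).
import torch
import torch.nn as nn

class DisentangledTTS(nn.Module):
    def __init__(self, n_phonemes, n_speakers, n_accents, n_styles,
                 emb_dim=128, mel_dim=80):
        super().__init__()
        self.phoneme_emb = nn.Embedding(n_phonemes, emb_dim)  # linguistic content
        self.speaker_emb = nn.Embedding(n_speakers, emb_dim)  # who is speaking
        self.accent_emb = nn.Embedding(n_accents, emb_dim)    # regional accent
        self.style_emb = nn.Embedding(n_styles, emb_dim)      # e.g. neutral, whisper
        self.encoder = nn.GRU(emb_dim, emb_dim, batch_first=True)
        self.decoder = nn.GRU(4 * emb_dim, emb_dim, batch_first=True)
        self.to_mel = nn.Linear(emb_dim, mel_dim)              # mel-spectrogram frames

    def forward(self, phoneme_ids, speaker_id, accent_id, style_id):
        content, _ = self.encoder(self.phoneme_emb(phoneme_ids))  # (B, T, E)
        T = content.size(1)
        # Broadcast the attribute embeddings across every time step, then decode.
        attrs = torch.cat([
            self.speaker_emb(speaker_id), self.accent_emb(accent_id),
            self.style_emb(style_id)], dim=-1).unsqueeze(1).expand(-1, T, -1)
        hidden, _ = self.decoder(torch.cat([content, attrs], dim=-1))
        return self.to_mel(hidden)  # acoustic features for a vocoder

# Because the factors are separate inputs, the same text can be rendered in a
# different accent or style simply by swapping one ID at inference time.
model = DisentangledTTS(n_phonemes=60, n_speakers=10, n_accents=4, n_styles=3)
phonemes = torch.randint(0, 60, (1, 25))
mel_neutral = model(phonemes, torch.tensor([2]), torch.tensor([1]), torch.tensor([0]))
mel_whisper = model(phonemes, torch.tensor([2]), torch.tensor([1]), torch.tensor([2]))
```

The design choice this sketch illustrates is the core of disentanglement: linguistic content, speaker identity, accent, and style are represented as independent factors, so recombining them yields new voices without recording new data for every combination.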

In what applications do you expect this work to have the greatest influence?

Customers with voice products such as voice AI (Alexa) or IVR (Amazon Connect), as well as Amazon Polly users, will be able to easily enhance their portfolios with a wider range of TTS voices that speak different accents or languages, with different speaker characteristics (gender, age) or different styles suited to their global markets.
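As a concrete example of the kind of portfolio choice already available today, here is a small sketch using the Amazon Polly API via boto3 to synthesize the same text with voices in different languages and accents. It assumes AWS credentials are configured, and the specific voice names and neural-engine availability vary by region, so treat the combinations below as illustrative.

```python
# Illustrative only: selecting Amazon Polly voices with different languages/accents.
import boto3

polly = boto3.client("polly")  # assumes AWS credentials and region are configured

requests = [
    {"VoiceId": "Joanna", "LanguageCode": "en-US"},  # US English
    {"VoiceId": "Amy",    "LanguageCode": "en-GB"},  # British English
    {"VoiceId": "Lupe",   "LanguageCode": "es-US"},  # US Spanish
]

for req in requests:
    response = polly.synthesize_speech(
        Text="Welcome to our store.",
        OutputFormat="mp3",
        Engine="neural",
        **req,
    )
    # Write each synthesized clip to its own MP3 file.
    with open(f"welcome_{req['VoiceId'].lower()}.mp3", "wb") as f:
        f.write(response["AudioStream"].read())
```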

What are the most important points you hope the audience takes away from your talk?

We can use ML techniques to change different aspects of speech and to improve the diversity and style of TTS voices, thereby meeting the needs of different customers.

Amazon re:MARS 2022: Advancing diversity and inclusion in voice AI with speech disentanglement
