Amazon’s papers at Interspeech 2023, sorted by research topic.
Automatic speech recognition
A metric-driven approach to conformer layer pruning for efficient ASR inference
Dhanush Bekal, Karthik Gopalakrishnan, Karel Mundnich, Srikanth Ronanki, Sravan Bodapati, Katrin Kirchhoff
Conmer: Streaming Conformer without self-attention for interactive voice assistants
Martin Radfar, Paulina Lyskawa, Brandon Trujillo, Yi Xie, Kai Zhen, Jahn Heymann, Denis Filimonov, Grant Strimel, Nathan Susanj, Athanasios Mouchtaris
DCTX-Conformer: Dynamic context carry-over for low latency unified streaming and non-streaming Conformer
Goeric Huybrechts, Srikanth Ronanki, Xilai Li, Hadis Nosrati, Sravan Bodapati, Katrin Kirchhoff
Distillation strategies for discriminative speech recognition rescoring
Prashanth Gurunath Shivakumar, Jari Kolehmainen, Yile Gu, Ankur Gandhe, Ariya Rastrow, Ivan Bulyko
Effective training of attention-based contextual biasing adapters with synthetic audio for personalised ASR
Burin Naowarat, Philip Harding, Pasquale d’Alterio, Sibo Tong, Bashar Awwad Shiekh Hasan
Human transcription quality improvement
Jian Gao, Hanbo Sun, Cheng Cao, Zheng Du
Learning when to trust which teacher for weakly supervised ASR
Aakriti Agrawal, Milind Rao, Anit Kumar Sahu, Gopinath (Nath) Chennupati, Andreas Stolcke
Model-internal slot-triggered biasing for domain expansion in neural transducer ASR models
Edie Lu, Philip Harding, Kanthashree Mysore Sathyendra, Sibo Tong, Xuandi Fu, Jing Liu, Feng-Ju (Claire) Chang, Simon Wiesler, Grant Strimel
Multi-view frequency-attention alternative to CNN frontends for automatic speech recognition
Belen Alastruey Lasheras, Lukas Drude, Jahn Heymann, Simon Wiesler
Multilingual contextual adapters to improve custom word recognition in low-resource languages
Devang Kulshreshtha, Saket Dingliwal, Brady Houston, Sravan Bodapati
PATCorrect: Non-autoregressive phoneme-augmented transformer for ASR error correction
Ziji Zhang, Zhehui Wang, Raj Kamma, Sharanya Eswaran, Narayanan Sadagopan
Personalization for BERT-based discriminative speech recognition rescoring
Jari Kolehmainen, Yile Gu, Aditya Gourav, Prashanth Gurunath Shivakumar, Ankur Gandhe, Ariya Rastrow, Ivan Bulyko
Personalized predictive ASR for latency reduction in voice assistants
Andreas Schwarz, Di He, Maarten Van Segbroeck, Mohammed Hethnawi, Ariya Rastrow
Record deduplication for entity distribution modeling in ASR transcripts
Tianyu Huang, Chung Hoon Hong, Carl Wivagg, Kanna Shimizu
Scaling laws for discriminative speech recognition rescoring models
Yile Gu, Prashanth Gurunath Shivakumar, Jari Kolehmainen, Ankur Gandhe, Ariya Rastrow, Ivan Bulyko
Selective biasing with trie-based contextual adapters for personalised speech recognition using neural transducers
Philip Harding, Sibo Tong, Simon Wiesler
Streaming speech-to-confusion network speech recognition
Denis Filimonov, Prabhat Pandey, Ariya Rastrow, Ankur Gandhe, Andreas Stolcke
Data representation
Don't stop self-supervision: Accent adaptation of speech representations via residual adapters
Anshu Bhatia, Sanchit Sinha, Saket Dingliwal, Karthik Gopalakrishnan, Sravan Bodapati, Katrin Kirchhoff
Dialogue management
Parameter-efficient low-resource dialogue state tracking by prompt tuning
Mingyu Derek Ma, Jiun-Yu Kao, Shuyang Gao, Arpit Gupta, Di Jin, Tagyoung Chung, Violet Peng
Grapheme-to-phoneme conversion
Improving grapheme-to-phoneme conversion by learning pronunciations from speech recordings
Sam Ribeiro, Giulia Comini, Jaime Lorenzo Trueba
Keyword spotting
On-device constrained self-supervised speech representation learning for keyword spotting via knowledge distillation
Gene-Ping Yang, Yue Gu, Qingming Tang, Dongsu Du, Yuzong Liu
Natural-language understanding
Quantization-aware and tensor-compressed training of transformers for natural language understanding
Zi Yang, Samridhi Choudhary, Siegfried Kunzmann, Zheng Zhang
Sampling bias in NLU models: Impact and mitigation
Zefei Li, Anil Ramakrishna, Anna Rumshisky, Andy Rosenbaum, Saleh Soltan, Rahul Gupta
Understanding disrupted sentences using underspecified abstract meaning representation
Angus Addlesee, Marco Damonte
Paralinguistics
Towards paralinguistic-only speech representations for end-to-end speech emotion recognition
George Ioannides, Michael Owen, Andrew Fletcher, Viktor Rozgic, Chao Wang
Utility-preserving privacy-enabled speech embeddings for emotion detection
Chandrashekhar Lavania, Sanjiv Das, Xin Huang, Kyu Han
Question answering
Question-context alignment and answer-context dependencies for effective answer sentence selection
Minh van Nguyen, Kishan KC, Toan Nguyen, Thien Nguyen, Ankit Chadha, Thuy Vu
Speaker diarization
Lexical speaker error correction: Leveraging language models for speaker diarization error correction
Rohit Paturi, Sundararajan Srinivasan, Xiang Li
Speech translation
Knowledge distillation on joint task end-to-end speech translation
Khandokar Md. Nayem, Ran Xue, Ching-Yun (Frannie) Chang, Akshaya Vishnu Kudlu Shanbhogue
Text-to-speech
Comparing normalizing flows and diffusion models for prosody and acoustic modelling in text-to-speech
Guangyan Zhang, Tom Merritt, Sam Ribeiro, Biel Tura Vecino, Kayoko Yanagisawa, Kamil Pokora, Abdelhamid Ezzerg, Sebastian Cygert, Ammar Abbas, Piotr Bilinski, Roberto Barra-Chicote, Daniel Korzekwa, Jaime Lorenzo Trueba
Cross-lingual prosody transfer for expressive machine dubbing
Jakub Swiatkowski, Duo Wang, Mikolaj Babianski, Patrick Tobing, Ravi Chander Vipperla, Vincent Pollet
Diffusion-based accent modelling in speech synthesis
Kamil Deja, Georgi Tinchev, Marta Czarnowska, Marius Cotescu, Jasha Droppo
eCat: An end-to-end model for multi-speaker TTS & many-to-many fine-grained prosody transfer
Ammar Abbas, Sri Karlapati, Bastian Schnell, Penny Karanasou, Marcel Granero Moya, Amith Nagaraj, Ayman Boustati, Nicole Peinelt, Alexis Moinet, Thomas Drugman
Expressive machine dubbing through phrase-level cross-lingual prosody transfer
Jakub Swiatkowski, Duo Wang, Mikolaj Babianski, Giuseppe Coccia, Patrick Tobing, Ravi Chander Vipperla, Viacheslav Klimkov, Vincent Pollet
Multilingual context-based pronunciation learning for text-to-speech
Giulia Comini, Sam Ribeiro, Fan Yang, Heereen Shim, Jaime Lorenzo Trueba