A quick guide to Amazon's 40+ papers at EMNLP 2023

Natural-language understanding (NLU) has long been a key focus of the papers that Amazon scientists publish at the Conference on Empirical Methods in Natural Language Processing (EMNLP), but at this year’s conference, which starts today, Amazon’s NLU research shows an interest in utilizing the power of large language models (LLMs). Question answering also remains an active research topic, while query reformulation and text summarization emerge as new areas of concentration.

Automatic speech recognition

AdaBERT-CTC: Leveraging BERT-CTC for text-only domain adaptation in ASR
Tyler Vuong, Karel Mundnich, Dhanush Bekal, Veera Raghavendra Elluru, Srikanth Ronanki, Sravan Bodapati

Continual learning

Coordinated replay sample selection for continual federated learning
Jack Good, Jimit Majmudar, Christophe Dupuy, Jixuan Wang, Charith Peris, Clement Chung, Richard Zemel, Rahul Gupta

Data extraction

InsightNet: Structured insight mining from customer feedback
Sandeep Mukku, Manan Soni, Chetan Aggarwal, Jitenkumar Rana, Promod Yenigalla, Rashmi Patange, Shyam Mohan

Knowledge-selective pretraining for attribute value extraction
Hui Liu, Qingyu Yin, Zhengyang Wang, Chenwei Zhang, Haoming Jiang, Yifan Gao, Zheng Li, Xian Li, Chao Zhang, Bing Yin, William Wang, Xiaodan Zhu

Data selection

Influence scores at scale for efficient language data sampling
Nikhil Anand, Joshua Tan, Maria Minakova

Document understanding

A multimodal multilingual benchmark for document image classification
Yoshinari Fujinuma, Siddharth Varia, Nishant Sankaran, Bonan Min, Srikar Appalaraju, Yogarshi Vyas

Semantic matching for text classification with complex class descriptions
Brian de Silva, Kuan-Wen Huang, Gwang Lee, Karen Hovsepian, Yan Xu, Mingwei Shen

Embodied task completion

Multimodal embodied plan prediction augmented with synthetic embodied dialogue
Aishwarya Padmakumar, Mert Inan, Spandana Gella, Patrick Lange, Dilek Hakkani-Tür

Entity linking

MReFinED: An efficient end-to-end multilingual entity linking system
Peerat Limkonchotiwat, Weiwei Cheng, Christos Christodoulopoulos, Amir Saffari, Jens Lehmann

Few-shot learning

Automated few-shot classification with instruction-finetuned language models
Rami Aly, Xingjian Shi, Kaixiang Lin, Aston Zhang, Andrew Wilson

Information retrieval

Deep metric learning to hierarchically rank: An application in product retrieval
Kee Kiat Koo, Ashutosh Joshi, Nishaanth Reddy, Ismail Tutar, Vaclav Petricek, Changhe Yuan, Karim Bouyarmane

KD-Boost: Boosting real-time semantic matching in e-commerce with knowledge distillation
Sanjay Agrawal, Vivek Sembium, Ankith M S

Multi-teacher distillation for multilingual spelling correction
Jingfen Zhang, Xuan Guo, Sravan Bodapati, Christopher Potts

Instruction tuning

CESAR: Automatic induction of compositional instructions for multi-turn dialogs
Taha Aksu, Devamanyu Hazarika, Shikib Mehri, Seokhwan Kim, Dilek Hakkani-Tür, Yang Liu, Mahdi Namazifar

LLM hallucination

INVITE: A testbed of automatically generated invalid questions to evaluate large language models for hallucinations
Anil Ramakrishna, Rahul Gupta, Jens Lehmann, Morteza Ziyadi

Machine learning

Efficient long-range transformers: You need to attend more, but not necessarily at every layer
Qingru Zhang, Dhananjay Ram, Cole Hawkins, Sheng Zha, Tuo Zhao

Natural-language processing

NameGuess: Column name expansion for tabular data
Jiani Zhang, Zhengyuan Shen, Balasubramaniam Srinivasan, Shen Wang, Huzefa Rangwala, George Karypis

Natural-language understanding

Adversarial robustness for large language NER models using disentanglement and word attributions
Xiaomeng Jin, Bhanu Vinzamuri, Sriram Venkatapathy, Heng Ji, Pradeep Natarajan

Measuring and mitigating constraint violations of in-context learning for utterance-to-API semantic parsing
Shufan Wang, Sebastien Jean, Sailik Sengupta, James Gung, Nikolaos Pappas, Yi Zhang

Overview of the pretraining of an intent-aware encoder. Given an utterance, X₁, from the pretraining corpus, Amazon scientists generate a pseudo-intent name, Y₁^pseudo, using labels from intent-role-labeling (IRL) tags. The model is then optimized by pulling the gold utterance, X₁^gold, the gold intent, Y₁, and the pseudo-intent, Y₁^pseudo, close to the input utterance, X₁, in the embedding space. From “Pre-training intent-aware encoders for zero- and few-shot intent classification”.

MultiCoNER v2: A large multilingual dataset for fine-grained and noisy named entity recognition
Besnik Fetahu, Zhiyu Chen, Sudipta Kar, Oleg Rokhlenko, Shervin Malmasi

Pre-training intent-aware encoders for zero- and few-shot intent classification
Mujeen Sung, James Gung, Elman Mansimov, Nikolaos Pappas, Raphael Shu, Salvatore Romeo, Yi Zhang, Vittorio Castelli

Personalization

Personalized dense retrieval on global index for voice-enabled conversational systems
Masha Belyi, Charlotte Dzialo, Chaitanya Dwivedi, Prajit Reddy Muppidi, Kanna Shimizu

Retrieve and copy: Scaling ASR personalization to large catalogs
Sai Muralidhar Jayanthi, Devang Kulshreshtha, Saket Dingliwal, Srikanth Ronanki, Sravan Bodapati

Query reformulation

CL-QR: Cross-lingual enhanced query reformulation for multi-lingual conversational AI agents
Zhongkai Sun, Zhengyang Zhao, Sixing Lu, Chengyuan Ma, Xiaohu Liu, Xing Fan, Wei (Sawyer) Shen, Chenlei (Edward) Guo

Graph meets LLM: A novel approach to collaborative filtering for robust conversational understanding
Zheng Chen, Ziyan Jiang, Fan Yang, Eunah Cho, Xing Fan, Xiaojiang Huang, Yanbin Lu, Aram Galstyan

Improving contextual query rewrite for conversational AI agents through user-preference feedback learning
Zhongkai Sun, Yingxue Zhou, Jie Hao, Xing Fan, Yanbin Lu, Chengyuan Ma, Wei (Sawyer) Shen, Chenlei (Edward) Guo

Question-answer databases

ProTeGe: Prompt-based diverse question generation from web articles
Vinayak Puranik, Anirban Majumder, Vineet Chaoji

QUADRo: Dataset and models for question-answer database retrieval
Stefano Campese, Ivano Lauriola, Alessandro Moschitti

Question answering

Strong and efficient baselines for open domain conversational question answering
Andrei C. Coman, Gianni Barlacchi, Adrià de Gispert

Tokenization consistency matters for generative models on extractive NLP tasks
Kaiser Sun, Peng Qi, Yuhao Zhang, Lan Liu, William Yang Wang, Zhiheng Huang

Too much of Product Information: Don’t worry, let’s look for evidence!
Aryan Jain, Jitenkumar Rana, Chetan Aggarwal

Reasoning

Plan, verify and switch: Integrated reasoning with diverse X-of-thoughts
Tengxiao Liu, Qipeng Guo, Yuqing Yang, Xiangkun Hu, Yue Zhang, Xipeng Qiu, Zheng Zhang

Responsible AI

Geographical erasure in language generation
Pola Schwöbel, Jacek Golebiowski, Michele Donini, Cédric Archambeau, Danish Pruthi

Speech translation

End-to-end single-channel speaker-turn aware conversational speech translation
Juan Pablo Zuluaga Gomez, Zhaocheng Huang, Xing Niu, Rohit Paturi, Sundararajan Srinivasan, Prashant Mathur, Brian Thompson, Marcello Federico

Text summarization

Enhancing abstractiveness of summarization models through calibrated distillation
Hwanjun Song, Igor Shalyminov, Hang Su, Siffi Singh, Kaisheng Yao, Saab Mansour

Generating summaries with controllable readability levels
Leonardo Ribeiro, Mohit Bansal, Markus Dreyer

Improving consistency for text summarization with energy functions
Qi Zeng, Qingyu Yin, Zheng Li, Yifan Gao, Sreyashi Nag, Zhengyang Wang, Bing Yin, Heng Ji, Chao Zhang

InstructPTS: Instruction-tuning LLMs for product title summarization
Besnik Fetahu, Zhiyu Chen, Oleg Rokhlenko, Shervin Malmasi

Multi document summarization evaluation in the presence of damaging content
Avshalom Manevich, David Carmel, Nachshon Cohen, Elad Kravi, Ori Shapira

Re-examining summarization evaluation across multiple quality criteria
Ori Ernst, Ori Shapira, Ido Dagan, Ran Levy

Topic modeling

DeTiME: Diffusion-enhanced topic modeling using encoder-decoder-based LLM
Weijie Xu, Wenxiang Hu, Fanyou Wu, Srinivasan Sengamedu, “SHS”
