Bridging the Gap Between Clean Data Training and Real-World Inference for Spoken Language Understanding

Authors: Dacheng Tao, Liang Ding, Di Wu, Yiren Chen

Abstract: A spoken language understanding (SLU) system usually consists of various pipeline components, where each component heavily relies on the results of its upstream ones. For example, intent detection (ID) and slot filling (SF) require automatic speech recognition (ASR) to transform voice into text. In this case, perturbations such as ASR errors, environmental noise, and careless user speech will propagate to the ID and SF models, thus deteriorating performance. Well-performing models are therefore expected to be noise-resistant to some extent. However, existing models are trained on clean data, which causes a gap between clean-data training and real-world inference. To bridge this gap, we propose a method from the perspective of domain adaptation, by which both high- and low-quality samples are embedded into a similar vector space. Meanwhile, we design a denoising generation model to reduce the impact of low-quality samples. Experiments on a widely-used dataset, i.e. Snips, and a large-scale in-house dataset (10 million examples) demonstrate that our method not only outperforms the baselines on real-world (noisy) corpora but also enhances robustness, that is, it produces high-quality results in a noisy environment. The source code is released.
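The abstract's core idea, embedding high-quality (clean) and low-quality (noisy) samples into a similar vector space via domain adaptation, can be illustrated with a standard domain-adversarial / gradient-reversal scheme. The sketch below is a toy NumPy illustration of that general technique, not the paper's actual architecture: a linear feature extractor is trained (via a reversed gradient) to prevent a logistic domain discriminator from telling clean embeddings apart from their noise-shifted counterparts. All names and dimensions here are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for utterance embeddings: "clean" samples and "noisy"
# (e.g. ASR-corrupted) versions of them, drawn with a systematic shift.
clean = rng.normal(loc=0.0, scale=1.0, size=(200, 8))
noisy = clean + rng.normal(loc=1.5, scale=0.5, size=(200, 8))

X = np.vstack([clean, noisy])
d = np.concatenate([np.zeros(200), np.ones(200)])  # 0 = clean, 1 = noisy

W = rng.normal(scale=0.1, size=(8, 8))  # linear feature extractor
w = rng.normal(scale=0.1, size=8)       # domain discriminator weights
b = 0.0
lam, lr = 1.0, 0.05                     # reversal strength, learning rate

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(500):
    F = X @ W                    # shared features
    p = sigmoid(F @ w + b)       # P(domain = noisy | features)
    g = (p - d) / len(d)         # dLoss/dlogits for binary cross-entropy
    # The discriminator descends its loss (learns to separate domains)...
    grad_w = F.T @ g
    grad_b = g.sum()
    # ...while the gradient-reversal layer flips the sign for the extractor,
    # so it learns features on which the domains are indistinguishable.
    grad_W = -lam * (X.T @ np.outer(g, w))
    w -= lr * grad_w
    b -= lr * grad_b
    W -= lr * grad_W

# If alignment works, the discriminator drifts toward chance level,
# i.e. clean and noisy samples occupy a similar feature space.
acc = ((sigmoid((X @ W) @ w + b) > 0.5) == d).mean()
print(f"domain discriminator accuracy: {acc:.2f}")
```

In the actual SLU setting the extractor would be the shared encoder feeding the ID/SF heads, and the discriminator would receive its gradient through a reversal layer rather than via the hand-derived updates used in this toy.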
