Automatic Speech Recognition

Authors: Uday Kamath, John Liu, James Whitaker

DOI: 10.1007/978-3-030-14596-5_8

Abstract: Automatic speech recognition (ASR) has grown tremendously in recent years, with deep learning playing a key role. Simply put, ASR is the task of converting spoken language into computer-readable text (Fig. 8.1). It has quickly become ubiquitous as a useful way to interact with technology, significantly bridging the gap in human–computer interaction and making it more natural.
