Signal processing for robust speech recognition

Authors: Fu-Hua Liu, Pedro J. Moreno, Richard M. Stern, Alejandro Acero

DOI: 10.3115/1075812.1075889

Abstract: This paper describes a series of cepstral-based compensation procedures that render the SPHINX-II system more robust with respect to acoustical environment. The first algorithm, phone-dependent cepstral compensation, is similar in concept to the previously-described MFCDCN method, except that cepstral compensation vectors are selected according to the current phonetic hypothesis, rather than on the basis of SNR or VQ codeword identity. We also describe two procedures to accomplish adaptation of the VQ codebook for new environments, as well as the use of reduced-bandwidth frequency analysis to process telephone-bandwidth speech. Use of the various compensation algorithms in consort produces a reduction of error rates for SPHINX-II by as much as 40 percent relative to the rate achieved with cepstral mean normalization alone, in both development test sets and in the context of the 1993 ARPA CSR evaluations.

References (39)
Stephanie Seneff, A joint synchrony/mean-rate model of auditory speech processing. Journal of Phonetics, vol. 16, pp. 101-111 (1990), 10.1016/S0095-4470(19)30466-8
Victor W. Zue, Helen M. Meng, A comparative study of acoustic representations of speech for vowel classification using multi-layer perceptrons. Conference of the International Speech Communication Association (1990)
R. Lyon, A computational model of filtering, detection, and compression in the cochlea. International Conference on Acoustics, Speech, and Signal Processing, vol. 7, pp. 1282-1285 (1982), 10.1109/ICASSP.1982.1171644
R.D. Patterson, K. Robinson, J. Holdsworth, D. McKeown, C. Zhang, M. Allerhand, Complex Sounds and Auditory Images. Auditory Physiology and Perception: Proceedings of the 9th International Symposium on Hearing Held in Carcens, France, on 9–14 June 1991, pp. 429-446 (1992), 10.1016/B978-0-08-041847-6.50054-X
Patrick Mangan Peterson, Adaptive array processing for multiple microphone hearing aids. Massachusetts Institute of Technology (1989)
Bodden, Modeling human sound-source localization and the cocktail-party-effect. Acta Acustica, vol. 1, pp. 43-55 (1993)
M.J.F. Gales, S.J. Young, Cepstral parameter compensation for HMM recognition in noise. Speech Communication, vol. 12, pp. 231-239 (1993), 10.1016/0167-6393(93)90093-Z
B.H. Juang, Speech recognition in adverse environments. Computer Speech & Language, vol. 5, pp. 275-294 (1991), 10.1016/0885-2308(91)90011-E
George C. Johnston, Michael A. Swieboda, Loudspeaker diaphragm and method for making same. The Journal of the Acoustical Society of America, vol. 78, pp. 2163-2163 (1985), 10.1121/1.392618