Should recognizers have ears?

Author: Hynek Hermansky

DOI: 10.1016/S0167-6393(98)00027-2

Abstract: Recently, techniques motivated by human auditory perception have been applied in mainstream speech technology, and there seems to be renewed interest in implementing more knowledge of human speech communication into the design of a speech recognizer. The paper discusses the author's experience with applying auditory knowledge to automatic recognition of speech. It advances the notion that the reason for applying such knowledge in speech engineering should be the ability of perception to suppress some parts of the irrelevant information in the speech message, and it argues against the blind implementation of scattered accidental knowledge which may be irrelevant to a speech recognition task. The following three properties of human speech perception are discussed in some detail: (1) limited spectral resolution, (2) the use of information from about syllable-length segments, and (3) the ability to ignore corrupted or irrelevant components of speech. It shows, by referring to published works, that selective use of auditory knowledge, optimized on and in some cases directly derived from real speech data, can be consistent with current stochastic approaches to ASR and could yield advantages in practical engineering applications.
