Physiologically-Motivated Synchrony-Based Processing for Robust Automatic Speech Recognition

Authors: Chanwoo Kim, Yu-Hsiang Bosco Chiu, Richard M. Stern

Keywords: Computer science; Neurocomputational speech processing; Signal processing; Speech processing; Mechanism (biology); Speech recognition; Voice activity detection; Transient noise; Auditory system

Abstract: This paper describes the structure and performance of a new signal processing scheme, motivated by the physiology of the peripheral auditory system, that improves speech recognition accuracy in the presence of broadband noise. An important attribute of the processing is a novel mechanism to represent the cycle-by-cycle synchrony of the response of low-frequency auditory-nerve fibers, in addition to more conventional processing based on the mean rate of response. It is shown that the use of physiologically-motivated processing improves recognition accuracy in the presence of both broadband noise and transient noise, and that the synchrony mechanism provides further improvement beyond that which is provided by the mean-rate mechanism. Index Terms: auditory modeling, robust speech recognition, synchrony.

References (9)
Stephanie Seneff, "A joint synchrony/mean-rate model of auditory speech processing," Journal of Phonetics, vol. 16, pp. 101-111 (1990), doi:10.1016/S0095-4470(19)30466-8
Victor W. Zue, Helen M. Meng, "A comparative study of acoustic representations of speech for vowel classification using multi-layer perceptrons," Conference of the International Speech Communication Association (1990)
R. Lyon, "A computational model of filtering, detection, and compression in the cochlea," International Conference on Acoustics, Speech, and Signal Processing, vol. 7, pp. 1282-1285 (1982), doi:10.1109/ICASSP.1982.1171644
Murray B. Sachs, Eric D. Young, "Encoding of steady-state vowels in the auditory nerve: Representation in terms of discharge rate," Journal of the Acoustical Society of America, vol. 66, pp. 470-479 (1979), doi:10.1121/1.383098
Eric D. Young, Murray B. Sachs, "Representation of steady-state vowels in the temporal aspects of the discharge patterns of populations of auditory-nerve fibers," Journal of the Acoustical Society of America, vol. 66, pp. 1381-1403 (1979), doi:10.1121/1.383532
A.M.A. Ali, J. Van der Spiegel, P. Mueller, "Robust auditory-based speech processing using the average localized synchrony detection," IEEE Transactions on Speech and Audio Processing, vol. 10, pp. 279-292 (2002), doi:10.1109/TSA.2002.800556
Oded Ghitza, "Auditory models and human performance in tasks related to speech coding and speech recognition," IEEE Transactions on Speech and Audio Processing, vol. 2, pp. 115-132 (1994), doi:10.1007/978-1-4615-2281-2_17
Doh-Suk Kim, Soo-Young Lee, R.M. Kil, "Auditory processing of speech signals for robust speech recognition in real-world noisy environments," IEEE Transactions on Speech and Audio Processing, vol. 7, pp. 55-69 (1999), doi:10.1109/89.736331
B. Raj, V.N. Parikh, R.M. Stern, "The effects of background music on speech recognition accuracy," International Conference on Acoustics, Speech, and Signal Processing, vol. 2, pp. 851-854 (1997), doi:10.1109/ICASSP.1997.596069