"polyaural" array processing for automatic speech recognition in degraded environments.

Authors: Evandro B. Gouvêa, Richard M. Stern, Govindarajan Thattai

Abstract: In this paper we present a new method of signal processing for robust speech recognition using multiple microphones. The method, loosely based on the human binaural hearing system, consists of passing the signals detected by the microphones through bandpass filtering and nonlinear halfwave rectification operations, then cross-correlating the outputs from each channel within each frequency band. These operations provide rejection of off-axis interfering signals. The operations are repeated (in a non-physiological fashion) for the negative of the signal, and an estimate of the desired signal is obtained by combining the positive and negative outputs. We demonstrate that the use of this approach provides substantially better accuracy than delay-and-sum beamforming with the same sensors for a target in the presence of additive broadband maskers. Improvements in reverberant environments are tangible but more modest.
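The processing chain named in the abstract can be illustrated with a minimal sketch for a single frequency band. This is not the authors' implementation. It assumes the channels have already been bandpass filtered and time-aligned to the target direction, it uses the sample-wise product of the rectified channels as the cross-correlation step, and it takes a K-th root (for K microphones) to keep the output on the input scale. The function names (`halfwave`, `coincidence`, `polyaural_band`, `delay_and_sum`) and the toy signals are hypothetical and chosen only for illustration.

```python
import math

def halfwave(x):
    """Halfwave rectification: keep positive samples, zero out the rest."""
    return [v if v > 0.0 else 0.0 for v in x]

def coincidence(channels):
    """Sample-wise product of the rectified channels.

    Assumption: the product stands in for the cross-correlation step, and
    the K-th root keeps the result on the same scale as the inputs.
    """
    k = len(channels)
    rectified = [halfwave(ch) for ch in channels]
    return [math.prod(samples) ** (1.0 / k) for samples in zip(*rectified)]

def polyaural_band(channels):
    """Process the signal and its negative, then combine both outputs."""
    positive = coincidence(channels)
    negative = coincidence([[-v for v in ch] for ch in channels])
    return [p - n for p, n in zip(positive, negative)]

def delay_and_sum(channels):
    """Baseline: average of the (already aligned) channels."""
    k = len(channels)
    return [sum(samples) / k for samples in zip(*channels)]

def energy(x):
    """Mean squared value of a sequence."""
    return sum(v * v for v in x) / len(x)

# Toy signals (assumed values): a 500 Hz tone sampled at 16 kHz.
FS, F0, N = 16000, 500, 320
target = [math.sin(2 * math.pi * F0 * n / FS) for n in range(N)]
# Off-axis source: the same tone reaches the second microphone 8 samples
# (a quarter period) later than the first.
lagged = [math.sin(2 * math.pi * F0 * (n - 8) / FS) for n in range(N)]

on_axis = polyaural_band([target, target])        # identical channels
off_axis_poly = polyaural_band([target, lagged])  # misaligned channels
off_axis_das = delay_and_sum([target, lagged])

print(energy(on_axis), energy(off_axis_poly), energy(off_axis_das))
```

In this toy case the on-axis tone passes through unchanged, because the positive and negative halves recombine into the original waveform, while the misaligned tone only contributes where both rectified channels are nonzero at the same instant. A full system would run this per band after a filterbank and sum the band outputs, which the sketch leaves out.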
