Binaural processing for robust recognition of degraded speech

Authors: Anjali Menon, Chanwoo Kim, Umpei Kurokawa, Richard M. Stern

DOI: 10.1109/ASRU.2017.8268912

Keywords:

Abstract: This paper discusses a new combination of techniques that help in improving the accuracy of speech recognition in adverse conditions using two microphones. Classic approaches toward binaural speech processing use some form of cross-correlation over time across the two sensors to effectively isolate target speech from interferers. Several additional techniques using temporal and spatial masking have been proposed in the past to improve recognition accuracy in the presence of reverberation and interfering talkers. In this paper, we consider the use of cross-correlation across frequency over some limited range of frequency channels in addition to the existing methods of monaural and binaural processing. This has the effect of locating and reinforcing coincident peaks across frequency over the representation of binaural interaction and provides local smoothing over the specified range of frequencies. Combined with the temporal and spatial masking techniques mentioned above, this leads to significant improvements in binaural speech recognition.
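As a rough illustration of the classic idea the abstract refers to (cross-correlation over time across two sensors), the sketch below estimates the delay between a left and a right signal by picking the lag with the largest correlation. It is a generic textbook-style example, not the authors' method: the function names `cross_correlate` and `estimate_delay`, the `max_lag` parameter, and the toy signals are all assumptions made for illustration, and the across-frequency correlation and masking stages described in the abstract are not modeled.

```python
def cross_correlate(left, right, max_lag):
    """Return {lag: sum over n of left[n] * right[n + lag]} for |lag| <= max_lag.

    Hypothetical helper: samples falling outside the signals are skipped.
    """
    n = min(len(left), len(right))
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                total += left[i] * right[j]
        result[lag] = total
    return result


def estimate_delay(left, right, max_lag):
    """Return the lag (in samples) at which the two signals line up best.

    A positive value means the right signal lags the left one.
    """
    corr = cross_correlate(left, right, max_lag)
    return max(corr, key=corr.get)


if __name__ == "__main__":
    # Toy signals (assumed): the right channel is the left channel delayed by 3 samples.
    left = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0, 0, 0]
    right = [0, 0, 0, 0, 0, 1, 2, 1, 0, 0, 0, 0]
    print(estimate_delay(left, right, max_lag=5))
```

In a binaural front end, an estimate of this kind would typically be computed per frequency channel and per time frame; the brute-force loop here favors readability over speed.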

References (26)
H. Steven Colburn, Nathaniel I. Durlach, Chapter 11: Models of Binaural Interaction. Hearing. pp. 467-518 (1978), 10.1016/B978-0-12-161904-6.50018-X
Evandro B. Gouvêa, Richard M. Stern, Govindarajan Thattai, "Polyaural" array processing for automatic speech recognition in degraded environments. Conference of the International Speech Communication Association. pp. 926-929 (2007)
Chanwoo Kim, Richard M. Stern, Nonlinear Enhancement of Onset for Robust Speech Recognition. Conference of the International Speech Communication Association. pp. 2058-2061 (2010)
Ruth Y. Litovsky, H. Steven Colburn, William A. Yost, Sandra J. Guzman, The Precedence Effect. Springer, New York, NY. pp. 85-105 (1987), 10.1007/978-1-4612-4738-8_4
H. Steven Colburn, Abhijit Kulkarni, Models of Sound Localization. Springer, New York, NY. pp. 272-316 (2005), 10.1007/0-387-28863-5_8
Richard M. Stern, Constantine Trahiotis, The Role of Consistency of Interaural Timing Over Frequency in Binaural Lateralization. Auditory Physiology and Perception: Proceedings of the 9th International Symposium on Hearing Held in Carcens, France, on 9–14 June 1991. pp. 547-554 (1992), 10.1016/B978-0-08-041847-6.50067-8
Echo suppression in a computational model of the precedence effect. Workshop on Applications of Signal Processing to Audio and Acoustics. pp. 4- (1997), 10.1109/ASPAA.1997.625622
Hans Wallach, Edwin B. Newman, Mark R. Rosenzweig, The Precedence Effect in Sound Localization (Tutorial Reprint). Journal of the Audio Engineering Society. vol. 21, pp. 817-826 (1973)
B.R. Glasberg, B.C.J. Moore, A revision of Zwicker's loudness model. Acustica. vol. 82, pp. 335-345 (1996)