Binaural Scene Analysis with Multidimensional Statistical Filters

Authors: C. Spille, B. T. Meyer, M. Dietz, V. Hohmann

DOI: 10.1007/978-3-642-37762-4_6

Abstract: The segregation of concurrent speakers and other sound sources is an important aspect in improving the performance of audio technology, such as noise reduction and automatic speech recognition, ASR, in difficult acoustic conditions. This technology is relevant for applications like hearing aids, mobile audio devices, robotics, hands-free audio communication and speech-based computer interfaces. Computational auditory-scene analysis (CASA) techniques simulate aspects of the processing properties of the human perceptual system using statistical signal-processing techniques to improve inferences about the causes of the audio input received by the system. This study argues that CASA is a promising approach to achieve source separation and outlines several theoretical arguments to support this hypothesis. With a focus on computational binaural scene analysis, principles of CASA techniques are reviewed. Furthermore, in an experimental approach, the applicability of a recent model of binaural interaction to improve ASR performance in multi-speaker conditions with spatially separated moving speakers is explored. The binaural model provides input to a statistical inference filter that employs a priori information on possible movements of the sources in order to track the positions of the speakers. The tracks are used to adapt a beamformer that selects a specific speaker. The output of the beamformer is subsequently used in an ASR task. Compared to the unprocessed, that is, mixed, data in a two-speaker condition, word recognition rates obtained with enhanced signals based on binaural information were increased from 30.8 to 88.4 %, demonstrating the potential of the proposed CASA-based approach.
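
The tracking stage described in the abstract, a statistical filter that uses a priori knowledge of plausible source movements to follow a speaker's position, can be illustrated with a generic bootstrap particle filter. The sketch below is not the authors' filter: it assumes a single source, noisy azimuth observations in degrees (standing in for the output of a binaural model), a random-walk motion prior, and a Gaussian observation likelihood. The function names `simulate_track` and `track_azimuth` and all parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def simulate_track(n_frames, obs_std=10.0, seed=0):
    """Hypothetical test data: a source moving from -40 to +40 degrees."""
    rng = random.Random(seed)
    truth = [-40.0 + 80.0 * i / (n_frames - 1) for i in range(n_frames)]
    # Noisy direction estimates, one per frame.
    obs = [t + rng.gauss(0.0, obs_std) for t in truth]
    return truth, obs


def track_azimuth(observations, n_particles=500, step_std=2.0,
                  obs_std=10.0, seed=0):
    """Bootstrap particle filter over source azimuth (degrees)."""
    rng = random.Random(seed)
    # Initialise particles uniformly over the frontal half-plane.
    particles = [rng.uniform(-90.0, 90.0) for _ in range(n_particles)]
    estimates = []
    for z in observations:
        # Predict: random-walk motion prior (assumed movement model),
        # clipped to the valid azimuth range.
        particles = [min(90.0, max(-90.0, p + rng.gauss(0.0, step_std)))
                     for p in particles]
        # Update: Gaussian likelihood of the observation for each particle.
        weights = [math.exp(-0.5 * ((z - p) / obs_std) ** 2)
                   for p in particles]
        total = sum(weights)
        if total == 0.0:
            # Degenerate case: fall back to uniform weights.
            weights = [1.0] * n_particles
            total = float(n_particles)
        # Estimate: weighted mean of the particle positions.
        estimates.append(
            sum(w * p for w, p in zip(weights, particles)) / total)
        # Resample particles in proportion to their weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


if __name__ == "__main__":
    truth, obs = simulate_track(200, seed=1)
    est = track_azimuth(obs)
    for frame in range(0, 200, 40):
        print(f"frame {frame:3d}: true {truth[frame]:6.1f}  "
              f"observed {obs[frame]:6.1f}  tracked {est[frame]:6.1f}")
```

In a pipeline like the one the abstract outlines, the tracked azimuth would then steer a beamformer toward the selected speaker. That step is omitted here.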
