Channel selection measures for multi-microphone speech recognition

Authors: Martin Wolf, Climent Nadeu

DOI: 10.1016/J.SPECOM.2013.09.015

Keywords:

Abstract: Automatic speech recognition in a room with distant microphones is strongly affected by noise and reverberation. In scenarios where the speech signal is captured by several arbitrarily located microphones, the degree of distortion differs from one channel to another. In this work we deal with measures extracted from a given distorted signal that either estimate its quality or measure how well it fits the acoustic models of the recognition system. We then apply them to solve the problem of selecting the signal (i.e. the channel) that presumably leads to the lowest recognition error rate. New channel selection techniques are presented, and compared experimentally in reverberant environments with other approaches reported in the literature. Significant improvements in recognition rate are observed for most of the measures. A new measure based on the variance of the speech intensity envelope shows a good trade-off between recognition accuracy, latency and computational cost. Also, the combination of measures allows a further improvement in recognition rate.
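
The idea of ranking channels by the variance of the speech intensity envelope can be illustrated with a minimal sketch. This is not the authors' formulation, which this record does not give: it assumes a single broadband frame-energy envelope, gain removal by mean subtraction in the log domain, and cube-root compression. The names `envelope_variance` and `select_channel`, the frame sizes, and the synthetic test signals are all hypothetical choices made for illustration. The underlying intuition is that reverberation smears energy into pauses, which flattens the envelope and lowers its variance.

```python
import numpy as np


def envelope_variance(signal, frame_len=400, hop=160, eps=1e-12):
    """Variance of a gain-normalized, compressed frame-energy envelope.

    Simplified broadband illustration (assumption), not the paper's measure.
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - frame_len) // hop
    if n_frames < 2:
        raise ValueError("signal too short for the chosen frame length")
    # Index matrix: one row of sample indices per frame.
    idx = hop * np.arange(n_frames)[:, None] + np.arange(frame_len)[None, :]
    energy = np.mean(signal[idx] ** 2, axis=1) + eps
    # Subtracting the mean log energy removes the overall channel gain.
    log_env = np.log(energy)
    log_env -= log_env.mean()
    # Dividing the log by 3 is cube-root compression of the envelope.
    env = np.exp(log_env / 3.0)
    return float(np.var(env))


def select_channel(channels):
    """Return the index of the channel with the largest envelope variance."""
    scores = [envelope_variance(c) for c in channels]
    return int(np.argmax(scores)), scores


# Synthetic demo (assumption): noise with a 4 Hz amplitude modulation stands
# in for speech, and an exponentially decaying noise burst for a room response.
rng = np.random.default_rng(0)
fs = 8000
t = np.arange(2 * fs) / fs
clean = rng.standard_normal(t.size) * 0.5 * (1.0 + np.sin(2 * np.pi * 4 * t))
rir = rng.standard_normal(fs // 2) * np.exp(-np.arange(fs // 2) / (0.15 * fs))
reverberant = np.convolve(clean, rir)[: clean.size]

best, scores = select_channel([reverberant, clean])
print(best, scores)
```

A score of this kind needs no recognizer pass and no acoustic models, only frame energies, which is in line with the low latency and computational cost that the abstract attributes to the envelope-based measure.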

References (20)
John McDonough, Matthias Wölfel, Distant Speech Recognition (2009)
Christian Fügen, Matthias Wölfel, Shajith Ikbal, John W. McDonough, Multi-Source Far-Distance Microphone Selection and Combination for Automatic Transcription of Lectures, Conference of the International Speech Communication Association (2006)
Climent Nadeu Camprubí, Martin Wolf, On the potential of channel selection for recognition of reverberated speech with multiple microphones, Conference of the International Speech Communication Association, pp. 574-577 (2010)
Y. Shimizu, S. Kajita, K. Takeda, F. Itakura, Speech recognition based on space diversity using distributed multi-microphone, International Conference on Acoustics, Speech, and Signal Processing, vol. 3, pp. 1747-1750 (2000), DOI: 10.1109/ICASSP.2000.862090
A. Stolcke, Britta Wrede, R. Dhillon, C. Wooters, E. Shriberg, B. Peskin, A. Janin, J. Edwards, J. Ang, N. Morgan, J. Macias-Guarasa, S. Bhagat, The ICSI Meeting Project: Resources and Research, International Conference on Acoustics, Speech, and Signal Processing (2004)
Matthias Wölfel, Channel selection by class separability measures for automatic transcriptions on distant microphones, Conference of the International Speech Communication Association, pp. 582-585 (2007)
Yasunari Obuchi, Noise robust speech recognition using delta-cepstrum normalization and channel selection, Electronics and Communications in Japan, Part II: Electronics, vol. 89, pp. 9-20 (2006), DOI: 10.1002/ECJB.20281
Angel de la Torre, Jose C. Segura, Carmen Benitez, Antonio M. Peinado, Antonio J. Rubio, Non-linear transformations of the feature space for robust speech recognition, International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 401-404 (2002), DOI: 10.1109/ICASSP.2002.5743739
R. A. Fisher, The use of multiple measurements in taxonomic problems, Annals of Human Genetics, vol. 7, pp. 179-188 (1936), DOI: 10.1111/J.1469-1809.1936.TB02137.X
Hui Jiang, Confidence measures for speech recognition: A survey, Speech Communication, vol. 45, pp. 455-470 (2005), DOI: 10.1016/J.SPECOM.2004.12.004