On the potential of channel selection for recognition of reverberated speech with multiple microphones

Authors: Climent Nadeu Camprubí, Martin Wolf

Abstract: The performance of ASR systems in a room environment with distant microphones is strongly affected by reverberation. As the degree of signal distortion varies among acoustic channels (i.e. microphones), the recognition accuracy can benefit from a proper channel selection. In this paper, we experimentally show that there exists a large margin for WER reduction by channel selection, and discuss several possible methods which do not require any a-priori classification. Moreover, using an LVCSR task, a significant improvement is shown with a simple technique that uses a measure computed from the sub-band time envelope of the various microphone signals.
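The abstract does not say which measure is computed from the sub-band time envelope, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes that reverberation smooths the time envelope, and therefore scores each channel by the variance of its mean-normalised envelope, averaged over sub-bands, and picks the channel with the highest score. The two-band split, the frame length, the scoring rule, and all function names (`split_bands`, `frame_envelope`, `envelope_score`, `select_channel`, `toy_reverb`) are hypothetical and not taken from the paper.

```python
import math
import random


def split_bands(x, alpha=0.2):
    """Crude two-band split (assumption): one-pole low-pass and its residual."""
    low, high, state = [], [], 0.0
    for s in x:
        state += alpha * (s - state)
        low.append(state)
        high.append(s - state)
    return [low, high]


def frame_envelope(band, frame=50):
    """Time envelope of one band, taken as the RMS of non-overlapping frames."""
    return [
        math.sqrt(sum(v * v for v in band[i:i + frame]) / frame)
        for i in range(0, len(band) - frame + 1, frame)
    ]


def envelope_score(x):
    """Variance of the mean-normalised envelope, averaged over sub-bands.

    Assumed stand-in for the paper's measure: a flatter envelope
    (more reverberation) gives a lower score.
    """
    scores = []
    for band in split_bands(x):
        env = frame_envelope(band)
        mean = sum(env) / len(env)
        if mean == 0.0:
            scores.append(0.0)
            continue
        scores.append(sum((e / mean - 1.0) ** 2 for e in env) / len(env))
    return sum(scores) / len(scores)


def select_channel(channels):
    """Return (index of the highest-scoring channel, list of all scores)."""
    scores = [envelope_score(c) for c in channels]
    best = max(range(len(channels)), key=scores.__getitem__)
    return best, scores


def toy_reverb(x, delay=37, gain=0.9):
    """Toy feedback comb filter used only to imitate a reverberant channel."""
    y = list(x)
    for n in range(delay, len(y)):
        y[n] += gain * y[n - delay]
    return y


# Toy data: noise bursts (200 samples on, 200 off) and a smeared copy of them.
rng = random.Random(0)
clean = [rng.gauss(0.0, 1.0) if (n // 200) % 2 == 0 else 0.0 for n in range(4000)]
channels = [clean, toy_reverb(clean)]
best, scores = select_channel(channels)
```

In this toy setup the comb filter fills the silent gaps between bursts, so the envelope of the second channel fluctuates less and the first channel is selected. A real system would use a proper filter bank and speech recorded by several microphones.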

References (8)
Kevin Lohde, Rüdiger Hoffmann, Rico Petrick, Matthias Wolff. The harming part of room acoustics in automatic speech recognition. Conference of the International Speech Communication Association, pp. 1094-1097 (2007).
Jaume Padrell, Climent Nadeu, Dusan Macho, Pere Pujol. Speech recognition experiments with the SPEECON database using several robust front-ends. Conference of the International Speech Communication Association (2004).
Christian Fügen, Matthias Wölfel, Shajith Ikbal, John W. McDonough. Multi-Source Far-Distance Microphone Selection and Combination for Automatic Transcription of Lectures. Conference of the International Speech Communication Association (2006).
Y. Shimizu, S. Kajita, K. Takeda, F. Itakura. Speech recognition based on space diversity using distributed multi-microphone. International Conference on Acoustics, Speech, and Signal Processing, vol. 3, pp. 1747-1750 (2000). DOI: 10.1109/ICASSP.2000.862090
Henrik Schulz, José A. R. Fonollosa, David Rybach. Transcription of Catalan Broadcast Conversation. Text, Speech and Dialogue, pp. 154-161 (2009). DOI: 10.1007/978-3-642-04208-9_24
Matthias Wölfel. Channel selection by class separability measures for automatic transcriptions on distant microphones. Conference of the International Speech Communication Association, pp. 582-585 (2007).
T. Houtgast, H. J. M. Steeneken. A review of the MTF concept in room acoustics and its use for estimating speech intelligibility in auditoria. Journal of the Acoustical Society of America, vol. 77, pp. 1069-1077 (1985). DOI: 10.1121/1.392224
Yasunari Obuchi. Multiple-microphone robust speech recognition using decoder-based channel selection. Conference of the International Speech Communication Association, pp. 52- (2004).