Formulation of the REMOS concept from an uncertainty decoding perspective

Authors: Roland Maas, Walter Kellermann, Armin Sehr, Takuya Yoshioka, Marc Delcroix

DOI: 10.1109/ICDSP.2013.6622698

Abstract: In this paper, we introduce a new formulation of the REMOS (REverberation MOdeling for Speech recognition) concept from an uncertainty decoding perspective. Based on a convolutive observation model that relaxes the conditional independence assumption of hidden Markov models, REMOS effectively adapts automatic speech recognition (ASR) systems to noisy and strongly reverberant environments. While uncertainty decoding approaches are typically designed to operate irrespectively of the employed decoding routine of the ASR system, REMOS explicitly considers the additional information provided by the Viterbi decoder. In contrast to previous publications of the REMOS concept, we provide a conclusive derivation of its decoding routine using a Bayesian network representation in order to prove its inherent uncertainty decoding character.

References (26)
Eberhard Hänsler, Gerhard Schmidt, "Acoustic Echo and Noise Control: A Practical Approach," Wiley-Interscience (2004), doi:10.1002/0471678406
Jon A. Arrowood, Mark A. Clements, "Using observation uncertainty in HMM decoding," Conference of the International Speech Communication Association (2002)
M. J. F. Gales, "Model-based techniques for noise robust speech recognition," Ph.D. dissertation, University of Cambridge (1995)
Li Deng, "Front-End, Back-End, and Hybrid Techniques for Noise-Robust Speech Recognition," in Robust Speech Recognition of Uncertain or Missing Data, pp. 67-99 (2011), doi:10.1007/978-3-642-21317-5_4
Luca Rigazio, David Kryze, Haitian Xu, "Vector Taylor series based joint uncertainty decoding," Conference of the International Speech Communication Association (2006)
Reinhold Haeb-Umbach, "Uncertainty Decoding and Conditional Bayesian Estimation," in Robust Speech Recognition of Uncertain or Missing Data, pp. 9-33 (2011), doi:10.1007/978-3-642-21317-5_2
Pedro J. Moreno, "Speech recognition in noisy environments," Carnegie Mellon University (1996)
Takuya Yoshioka, Armin Sehr, Marc Delcroix, Keisuke Kinoshita, Roland Maas, Tomohiro Nakatani, Walter Kellermann, "Making Machines Understand Us in Reverberant Rooms: Robustness Against Reverberation for Automatic Speech Recognition," IEEE Signal Processing Magazine, vol. 29, pp. 114-126 (2012), doi:10.1109/MSP.2012.2205029
C.K. Raut, T. Nishimoto, S. Sagayama, "Maximum likelihood based HMM state filtering approach to model adaptation for long reverberation," IEEE Automatic Speech Recognition and Understanding Workshop, pp. 353-356 (2005), doi:10.1109/ASRU.2005.1566517
Roland Maas, Akshaya Thippur, Armin Sehr, Walter Kellermann, "An uncertainty decoding approach to noise- and reverberation-robust speech recognition," International Conference on Acoustics, Speech, and Signal Processing, pp. 7388-7392 (2013), doi:10.1109/ICASSP.2013.6639098