Fusion of multiple uncertainty estimators and propagators for noise robust ASR

Authors: Dung T. Tran, Emmanuel Vincent, Denis Jouvet

DOI: 10.1109/ICASSP.2014.6854657

Abstract: Uncertainty decoding has been successfully used for speech recognition in highly nonstationary noise environments. Yet accurate estimation of the uncertainty on the denoised signals and its propagation to the features remain difficult. In this work, we propose to fuse the uncertainty estimates obtained from different uncertainty estimators and propagators by linear combination. The fusion coefficients are optimized by minimizing a measure of divergence with respect to oracle estimates on development data. Using the Kullback-Leibler divergence, we obtain an 18% relative error rate reduction on the 2nd CHiME Challenge with respect to conventional decoding, which is about twice the reduction achieved by the best single uncertainty estimator and propagator.
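The fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the nonnegativity constraint on the coefficients, the multiplicative update rule (of the kind used in nonnegative matrix factorization), the function names `kl_divergence` and `fit_fusion_weights`, and the synthetic data are all assumptions made for illustration. Each column of `estimates` stands for the variance produced by one hypothetical uncertainty estimator, and `oracle` stands for the oracle uncertainty on development data.

```python
import numpy as np


def kl_divergence(v, v_hat, eps=1e-12):
    """Generalized Kullback-Leibler divergence between nonnegative vectors."""
    v = np.maximum(v, eps)
    v_hat = np.maximum(v_hat, eps)
    return float(np.sum(v * np.log(v / v_hat) - v + v_hat))


def fit_fusion_weights(estimates, oracle, n_iter=200, eps=1e-12):
    """Fit nonnegative weights theta so that estimates @ theta approximates oracle.

    estimates: array of shape (N, K), one column per (hypothetical) estimator.
    oracle:    array of shape (N,), oracle uncertainties on development data.
    The multiplicative update below is an assumed optimizer, chosen because it
    keeps the weights nonnegative without an explicit projection step.
    """
    n_estimators = estimates.shape[1]
    theta = np.full(n_estimators, 1.0 / n_estimators)  # start from a plain average
    col_sums = estimates.sum(axis=0) + eps
    for _ in range(n_iter):
        fused = estimates @ theta + eps
        theta *= (estimates.T @ (oracle / fused)) / col_sums
    return theta


# Synthetic stand-in data (assumption): three estimators of varying quality.
rng = np.random.default_rng(0)
n_points = 500
oracle = rng.gamma(2.0, 1.0, size=n_points)
estimates = np.stack(
    [
        0.5 * oracle * rng.uniform(0.5, 1.5, n_points),  # tends to underestimate
        2.0 * oracle * rng.uniform(0.5, 1.5, n_points),  # tends to overestimate
        rng.gamma(2.0, 1.0, size=n_points),              # unrelated to the oracle
    ],
    axis=1,
)

theta = fit_fusion_weights(estimates, oracle)
fused = estimates @ theta

print("fusion weights:", theta)
print("KL(oracle, plain average):", kl_divergence(oracle, estimates.mean(axis=1)))
print("KL(oracle, fused):", kl_divergence(oracle, fused))
```

In this sketch the learned weights would be applied unchanged to test data, where no oracle is available. Other divergences or an unconstrained least-squares fit could be substituted in `fit_fusion_weights`; the abstract reports results for the Kullback-Leibler divergence.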

