Front-End, Back-End, and Hybrid Techniques for Noise-Robust Speech Recognition

Author: Li Deng

DOI: 10.1007/978-3-642-21317-5_4

Keywords: Bayesian probability, Uncertainty handling, Phase factor, Speech recognition, Decision rule, Front and back ends, Computer science, Classification rule, Thread (computing), Robustness (computer science)

Abstract: Noise robustness has long been an active area of research that captures significant interest from speech recognition researchers and developers. In this chapter, with a focus on the problem of uncertainty handling in robust speech recognition, we use the Bayesian framework as a common thread for connecting, analyzing, and categorizing a number of popular approaches to the solutions pursued in the recent past. The topics covered in this chapter include 1) Bayesian decision rules with unreliable features and unreliable model parameters; 2) principled ways of computing feature uncertainty using structured speech distortion models; 3) use of a phase factor in an advanced speech distortion model for feature compensation; 4) a novel perspective on model compensation as a special implementation of the general Bayesian predictive classification rule capitalizing on model parameter uncertainty; 5) a taxonomy of noise compensation techniques using two distinct axes, feature vs. model domain and structured vs. unstructured transformation; and 6) noise-adaptive training as a hybrid feature-model compensation framework and its various forms of extension.

References (91)
Alex Acero, Li Deng, John C. Platt, Hagai Attias. A new method for speech denoising and robust speech recognition using probabilistic models for clean speech and for noise. Conference of the International Speech Communication Association, pp. 1903-1906 (2001).
Dorothea Kolossa, Ramón Fernandez Astudillo, Reinhold Orglmeister. Accounting for the uncertainty of speech estimates in the complex domain for minimum mean square error speech enhancement. Conference of the International Speech Communication Association, pp. 2491-2494 (2009).
Qiang Huo, Yu Hu. Irrelevant variability normalization based HMM training using VTS approximation of an explicit model of environmental distortions. Conference of the International Speech Communication Association, pp. 1042-1045 (2007).
S. Khudanpur, Li Deng, N. Morgan, J. Glass, C.-H. Lee, J. Baker. Updated MINDS Report on Speech Recognition and Understanding. IEEE Signal Processing Magazine, vol. 26 (2009).
V. Stouten, H. Van hamme, P. Wambacq. Effect of phase-sensitive environment model and higher order VTS on noisy speech feature enhancement [speech recognition applications]. International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 433-436 (2005). DOI: 10.1109/ICASSP.2005.1415143
M. J. F. Gales, Hank Liao. Issues with Uncertainty Decoding for Noise Robust Speech Recognition. Conference of the International Speech Communication Association (2006).
M. J. F. Gales. Model-Based Approaches to Handling Uncertainty. Robust Speech Recognition of Uncertain or Missing Data, pp. 101-125 (2011). DOI: 10.1007/978-3-642-21317-5_5
Reinhold Haeb-Umbach, Volker Leutnant. An analytic derivation of a phase-sensitive observation model for noise robust speech recognition. Conference of the International Speech Communication Association, pp. 2395-2398 (2009).
Trausti T. Kristjansson, Alex Acero, Li Deng, Jerry Zhang. HMM adaptation using vector Taylor series for noisy speech recognition. Conference of the International Speech Communication Association, pp. 869-872 (2000).