Robust speech recognition based on a Bayesian prediction approach

Authors: Hui Jiang, K. Hirose, Qiang Huo

DOI: 10.1109/89.771309

Abstract: We study a category of robust speech recognition problem in which mismatches exist between training and testing conditions, and no accurate knowledge of the mismatch mechanism is available. The only available information is the test data along with a set of pretrained Gaussian mixture continuous density hidden Markov models (CDHMMs). We investigate the problem from the viewpoint of Bayesian prediction. A simple prior distribution, namely a constrained uniform distribution, is adopted to characterize the uncertainty of the mean vectors of the CDHMMs. Two methods are studied: a model compensation technique based on the Bayesian predictive density, and a robust decision strategy called Viterbi Bayesian predictive classification. The proposed methods are compared with the conventional Viterbi decoding algorithm in speaker-independent recognition experiments on isolated digits and TI connected digit strings (TIDIGITS), where the mismatches between training and testing conditions are caused by: (1) additive white noise, (2) each of 25 types of actual ambient noises, and (3) gender difference. The experimental results show that the adopted prior distribution and the proposed techniques help to improve the performance robustness under the examined mismatch conditions.
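
To make the idea of a predictive density under a constrained uniform prior concrete, here is a minimal sketch for a single scalar Gaussian. It is not the authors' implementation: the paper works with mean vectors of CDHMM mixture components, whereas this example assumes one dimension, a known standard deviation, and a prior that is uniform on [mean - radius, mean + radius]. The names `predictive_density`, `normal_cdf`, and `radius` are hypothetical and chosen only for illustration. Under these assumptions, integrating the Gaussian likelihood over the prior gives a closed form in terms of the normal CDF.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def predictive_density(x, mean, std, radius):
    """Density of scalar x when the Gaussian mean is uncertain.

    Assumption (illustrative only): the true mean is uniformly
    distributed on [mean - radius, mean + radius] and std is known.
    Integrating N(x; m, std^2) over m in that interval yields
    (Phi((x - mean + radius) / std) - Phi((x - mean - radius) / std)) / (2 * radius).
    """
    if radius <= 0.0:
        # No uncertainty: fall back to the ordinary (plug-in) Gaussian density.
        z = (x - mean) / std
        return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))
    upper = normal_cdf((x - mean + radius) / std)
    lower = normal_cdf((x - mean - radius) / std)
    return (upper - lower) / (2.0 * radius)


if __name__ == "__main__":
    # Compare plug-in and predictive scores for an observation far from the trained mean.
    plug_in = predictive_density(3.0, 0.0, 1.0, 0.0)
    predictive = predictive_density(3.0, 0.0, 1.0, 1.0)
    print(plug_in, predictive)
```

In this toy setting the predictive density has heavier tails than the plug-in Gaussian, so an observation shifted away from the trained mean is penalized less severely, which is the intuition behind using such a density when training and testing conditions are mismatched.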

References (23)
Kuldip K. Paliwal, Chin-Hui Lee, Frank K. Soong, Automatic Speech and Speaker Recognition: Advanced Topics. Kluwer Academic Publishers (1999)
Lalit R. Bahl, Frederick Jelinek, Robert L. Mercer, A Maximum Likelihood Approach to Continuous Speech Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-5, pp. 179-190 (1983). DOI: 10.1109/TPAMI.1983.4767370
Qiang Huo, Chorkin Chan, Chin-Hui Lee, On-line adaptation of the SCHMM parameters based on the segmental quasi-Bayes learning for speech recognition. IEEE Transactions on Speech and Audio Processing, vol. 4, pp. 141-144 (1996). DOI: 10.1109/89.486065
B.H. Juang, Speech recognition in adverse environments. Computer Speech & Language, vol. 5, pp. 275-294 (1991). DOI: 10.1016/0885-2308(91)90011-E
A. Nadas, Optimal solution of a training problem in speech recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 33, pp. 326-329 (1985). DOI: 10.1109/TASSP.1985.1164513
Yifan Gong, Speech recognition in noisy environments: a survey. Speech Communication, vol. 16, pp. 261-291 (1995). DOI: 10.1016/0167-6393(94)00059-J
A. Sankar, Chin-Hui Lee, A maximum-likelihood approach to stochastic matching for robust speech recognition. IEEE Transactions on Speech and Audio Processing, vol. 4, pp. 190-202 (1996). DOI: 10.1109/89.496215
Qiang Huo, Hui Jiang, Chin-Hui Lee, A Bayesian predictive classification approach to robust speech recognition. International Conference on Acoustics, Speech, and Signal Processing, vol. 2, pp. 1547-1550 (1997). DOI: 10.1109/ICASSP.1997.596246