Effect of phase-sensitive environment model and higher order VTS on noisy speech feature enhancement [speech recognition applications]

Authors: V. Stouten, H. Van hamme, P. Wambacq

DOI: 10.1109/ICASSP.2005.1415143

Abstract: Model-based techniques for robust speech recognition often require the statistics of noisy speech. In this paper, we propose two modifications to obtain more accurate versions of the statistics of the combined HMM (starting from a clean speech and a noise model). Usually, the phase difference between speech and noise is neglected in the acoustic environment model. However, we show how a phase-sensitive environment model can be efficiently integrated in the context of multi-stream model-based feature enhancement and gives rise to more accurate covariance matrices for the noisy speech. Also, by expanding the vector Taylor series up to the second order term, an improved noisy speech mean can be obtained. Finally, we explain how the front-end itself can be improved by a preprocessing of the training data. Recognition results on the Aurora4 database illustrate the effect on the noise robustness of each of these modifications.
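The two ideas named in the abstract can be illustrated with a minimal scalar sketch. It is not the authors' implementation. It assumes a single log-power-spectral coefficient, independent Gaussian clean speech x and noise n, and the commonly used mismatch function y = log(exp(x) + exp(n) + 2*alpha*exp((x+n)/2)), where alpha stands in for the phase term and alpha = 0 gives the phase-insensitive model. The function names, the clipping of alpha to [-0.9, 0.9], and all numeric values are hypothetical choices made for illustration only.

```python
import numpy as np


def mismatch(x, n, alpha=0.0):
    """Assumed log-power of noisy speech; alpha=0 is the phase-insensitive case."""
    return np.log(np.exp(x) + np.exp(n) + 2.0 * alpha * np.exp(0.5 * (x + n)))


def vts_mean(mu_x, mu_n, var_x, var_n, order=2):
    """VTS approximation of E[y] for alpha=0 and independent Gaussian x, n.

    The first-order expansion leaves the mean at mismatch(mu_x, mu_n);
    the second-order term adds half the curvature times the variances.
    """
    first = mismatch(mu_x, mu_n)
    if order < 2:
        return first
    s = 1.0 / (1.0 + np.exp(mu_n - mu_x))  # dy/dx at the expansion point
    curvature = s * (1.0 - s)              # d2y/dx2 = d2y/dn2 at that point
    return first + 0.5 * curvature * (var_x + var_n)


def monte_carlo(mu_x, mu_n, var_x, var_n, alpha_std=0.0, size=200_000, seed=0):
    """Sampled mean and variance of y, used here as the reference."""
    rng = np.random.default_rng(seed)
    x = rng.normal(mu_x, np.sqrt(var_x), size)
    n = rng.normal(mu_n, np.sqrt(var_n), size)
    # Hypothetical phase factor: zero-mean Gaussian, clipped so the log
    # argument stays strictly positive.
    alpha = np.clip(rng.normal(0.0, alpha_std, size), -0.9, 0.9)
    y = mismatch(x, n, alpha)
    return y.mean(), y.var()


mu_x, mu_n, var_x, var_n = 2.0, 1.0, 0.5, 0.5
mean_1 = vts_mean(mu_x, mu_n, var_x, var_n, order=1)
mean_2 = vts_mean(mu_x, mu_n, var_x, var_n, order=2)
mc_mean, mc_var = monte_carlo(mu_x, mu_n, var_x, var_n)
_, mc_var_phase = monte_carlo(mu_x, mu_n, var_x, var_n, alpha_std=0.3)

print("first-order VTS mean :", mean_1)
print("second-order VTS mean:", mean_2)
print("sampled mean         :", mc_mean)
print("sampled variance, alpha = 0     :", mc_var)
print("sampled variance, random alpha  :", mc_var_phase)
```

Under these assumptions the second-order mean should land closer to the sampled mean than the first-order one, and a random phase factor changes the sampled variance of the noisy speech, which is the kind of effect a phase-sensitive covariance estimate would need to capture.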

References (7)
Hugo Van hamme, Veronique Stouten, Patrick Wambacq. Accounting for the uncertainty of speech estimates in the context of model-based feature enhancement. International Conference on Spoken Language Processing, pp. 105-108 (2004).
C. Couvreur, H. Van Hamme. Model-based feature enhancement for noisy speech recognition. International Conference on Acoustics, Speech, and Signal Processing, vol. 3, pp. 1719-1722 (2000). DOI: 10.1109/ICASSP.2000.862083
M.J.F. Gales, S.J. Young. Robust continuous speech recognition using parallel model combination. IEEE Transactions on Speech and Audio Processing, vol. 4, pp. 352-359 (1996). DOI: 10.1109/89.536929
B. Raj, R. Singh, R. Stern. On tracking noise with linear dynamical system models. International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 965-968 (2004). DOI: 10.1109/ICASSP.2004.1326148
P.J. Moreno, B. Raj, R.M. Stern. A vector Taylor series approach for environment-independent speech recognition. International Conference on Acoustics, Speech, and Signal Processing, vol. 2, pp. 733-736 (1996). DOI: 10.1109/ICASSP.1996.543225
V. Stouten, H. Van Hamme, P. Wambacq. Joint removal of additive and convolutional noise with model-based feature enhancement. International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 949-952 (2004). DOI: 10.1109/ICASSP.2004.1326144