Warped-twice minimum variance distortionless response spectral estimation

Author: Matthias Wölfel

Abstract: This paper describes a novel extension to warped minimum variance distortionless response (MVDR) spectral estimation that allows the resolution of the spectral envelope estimate to be steered toward lower or higher frequencies while keeping the overall resolution of the estimate and the frequency axis fixed. This effect is achieved by introducing a second bilinear transformation into warped-MVDR spectral estimation, now in the frequency domain, as opposed to the first bilinear transformation, which is applied in the time domain, together with a compensation step that adjusts for the pre-emphasis introduced by both transformations. In the feature extraction process of an automatic speech recognition system, this makes it possible to emphasize classification-relevant characteristics while dropping irrelevant features, according to the signal being analyzed; for example, vowels and fricatives have different characteristics and should therefore be treated differently. We compared the extension against warped MVDR on the evaluation data of the Rich Transcription 2005 Spring Meeting Recognition Evaluation and obtained a word error rate reduction from 28.2% to 27.5%.
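As a rough illustration of the frequency warping that a single bilinear transformation produces, the following sketch evaluates the phase response of a first-order all-pass filter and its local slope. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the parameter name `alpha`, and the value 0.4 are chosen for illustration only, and neither the second transformation nor the compensation step described in the abstract is modeled.

```python
import math


def warp_frequency(omega: float, alpha: float) -> float:
    """Warped angular frequency for a first-order all-pass (bilinear) map.

    omega is a normalized angular frequency in radians (0..pi).
    alpha is the warp parameter, |alpha| < 1 (hypothetical name and range
    used here for illustration).
    """
    return omega + 2.0 * math.atan2(alpha * math.sin(omega),
                                    1.0 - alpha * math.cos(omega))


def local_resolution(omega: float, alpha: float) -> float:
    """Slope d(warped)/d(omega) of the map above.

    A value above 1 means the region around omega is stretched on the
    warped axis (more resolution); below 1 means it is compressed.
    """
    return (1.0 - alpha ** 2) / (1.0 + alpha ** 2 - 2.0 * alpha * math.cos(omega))


if __name__ == "__main__":
    alpha = 0.4  # illustrative value, not taken from the paper
    for k in range(5):
        omega = k * math.pi / 4.0
        print(f"omega={omega:.3f}  warped={warp_frequency(omega, alpha):.3f}  "
              f"slope={local_resolution(omega, alpha):.3f}")
```

With a positive `alpha` the slope exceeds 1 near zero frequency and falls below 1 near pi, so low frequencies receive more of the warped axis; a negative `alpha` reverses this. The endpoints 0 and pi map to themselves in both cases.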
