Compensation of channel and noise distortions combining normalization and speech enhancement techniques

Authors: Xavier Menéndez-Pidal, Ruxin Chen, Duanpei Wu, Mick Tanaka

DOI: 10.1016/S0167-6393(00)00049-2

Abstract: This paper introduces two techniques to obtain robust speech recognition devices in mismatch conditions (additive noise and channel mismatch). The first algorithm, the adaptive Gaussian attenuation algorithm (AGA), is a speech enhancement technique developed to reduce the effects of additive background noise over a wide range of signal-to-noise ratio (SNR) conditions. It is closely related to the classical spectral subtraction (SS) scheme, but in the proposed model the noise mean and variance are used to better attenuate the noise. Information about the SNR is also introduced to provide adaptability at different SNR conditions. The second algorithm, cepstral mean normalization and variance-scaling (CMNVS), is an extension of cepstral mean normalization (CMN) that provides features robust to convolutive distortions. The requirements of both techniques are analyzed in the paper. Combining both techniques, the relative distortion effects were reduced 90% on the HTIMIT task and 77% using the TIMIT database mixed with car noises.
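For orientation, the two classical baselines named in the abstract, spectral subtraction (SS) and cepstral mean normalization (CMN), can be sketched as follows. This is a minimal illustration and not the paper's AGA or CMNVS algorithms, whose details are not given here. The function names, the `floor` parameter, and the per-utterance scaling to unit variance are assumptions made for the example.

```python
from statistics import fmean, pstdev


def spectral_subtraction(noisy_mag, noise_mag, floor=0.01):
    """Classical magnitude spectral subtraction (assumed textbook form).

    noisy_mag: list of frames, each a list of magnitude-spectrum bins.
    noise_mag: one noise magnitude estimate per bin.
    floor: hypothetical spectral-floor factor that keeps results positive.
    """
    return [
        [max(x - n, floor * x) for x, n in zip(frame, noise_mag)]
        for frame in noisy_mag
    ]


def cmn_variance_scaling(cepstra, eps=1e-8):
    """Per-utterance cepstral mean subtraction followed by variance scaling.

    cepstra: list of frames, each a list of cepstral coefficients.
    A fixed convolutive channel adds a constant offset to each cepstral
    coefficient, so subtracting the per-coefficient mean removes it.
    """
    columns = list(zip(*cepstra))
    means = [fmean(c) for c in columns]
    stds = [pstdev(c) for c in columns]
    return [
        [(x - m) / (s + eps) for x, m, s in zip(frame, means, stds)]
        for frame in cepstra
    ]
```

Adding the same constant to every frame of `cepstra` leaves the output of `cmn_variance_scaling` unchanged, which is the property that makes mean normalization useful against channel mismatch.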

References (17)
Sangita Tibrewala, Hynek Hermansky. Multi-band and adaptation approaches to robust speech recognition. Conference of the International Speech Communication Association (1997).
Volker Schless, Fritz Class. SNR-dependent flooring and noise overestimation for joint application of spectral subtraction and model combination. Conference of the International Speech Communication Association (1998).
J.L. Gauvain, J.J. Gangolf, L. Lamel. Speech recognition for an information kiosk. International Conference on Spoken Language Processing, vol. 2, pp. 849-852 (1996). DOI: 10.1109/ICSLP.1996.607734
B. Milner. Inclusion of temporal information into features for speech recognition. International Conference on Spoken Language Processing, vol. 1, pp. 256-259 (1996). DOI: 10.1109/ICSLP.1996.607093
Brian A. Hanson, Ted H. Applebaum, Jean-Claude Junqua. Spectral Dynamics for Speech Recognition Under Adverse Conditions. Springer, Boston, MA, pp. 331-356 (1996). DOI: 10.1007/978-1-4613-1367-0_14
P. Lockwood, J. Boudy. Experiments with a Nonlinear Spectral Subtractor (NSS), Hidden Markov Models and the projection, for robust speech recognition in cars. Speech Communication, vol. 11, pp. 215-228 (1992). DOI: 10.1016/0167-6393(92)90016-Z
Fei Xie, Dirk Van Compernolle. Speech enhancement by spectral magnitude estimation—a unifying approach. Speech Communication, vol. 19, pp. 89-104 (1996). DOI: 10.1016/0167-6393(96)00022-2
K.-F. Lee, H.-W. Hon. Speaker-independent phone recognition using hidden Markov models. IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 37, pp. 1641-1648 (1989). DOI: 10.1109/29.46546
S. Verdu. Fifty years of Shannon theory. IEEE Transactions on Information Theory, vol. 44, pp. 13-34 (1998). DOI: 10.1109/18.720531