Large scale MMIE training for conversational telephone speech recognition

Authors: D Povey, PC Woodland

Abstract: This paper describes a lattice-based framework for maximum mutual information estimation (MMIE) of HMM parameters which has been used to train HMM systems for conversational telephone speech transcription using up to 265 hours of training data. These experiments represent the largest-scale application of discriminative training techniques for speech recognition of which the authors are aware, and have led to significant reductions in word error rate for both triphone and quinphone HMMs compared to our best models trained using maximum likelihood estimation. The use of MMIE training was a key contributor to the performance of the CU-HTK March 2000 Hub5 evaluation system.
