A discriminative training algorithm for hidden Markov models

Authors: A. Ben-Yishai, D. Burshtein

DOI: 10.1109/TSA.2003.822639

Abstract: We introduce a discriminative training algorithm for the estimation of hidden Markov model (HMM) parameters. This algorithm is based on an approximation of the maximum mutual information (MMI) objective function and its maximization in a technique similar to the expectation-maximization (EM) algorithm. The algorithm is implemented by a simple modification of the standard Baum-Welch algorithm, and can be applied to speech recognition as well as to word-spotting systems. Three tasks were tested: isolated digit recognition in a noisy environment, connected digit recognition in a noisy environment, and word-spotting. In all tasks a significant improvement over maximum likelihood (ML) estimation was observed. We also compared the new algorithm to the commonly used extended Baum-Welch MMI algorithm. In our tests the algorithm showed advantages in terms of both performance and computational complexity.
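
As a rough illustration of the contrast the abstract draws between ML and MMI training, the sketch below evaluates both criteria from per-model log-likelihoods. It shows only the standard MMI criterion (log posterior of the correct model), not the approximation proposed in the paper, whose exact form is not given here. The function names and the input layout (one list of per-model log-likelihoods per utterance) are assumptions made for this example.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def ml_objective(log_likes, labels):
    """ML criterion: sum of log p(O_r | correct model) over utterances.

    log_likes[r][w] is the log-likelihood of utterance r under model w
    (hypothetical layout); labels[r] is the index of the correct model.
    """
    return sum(ll[y] for ll, y in zip(log_likes, labels))


def mmi_objective(log_likes, labels, log_priors):
    """Standard MMI criterion: sum of log posteriors of the correct model.

    For each utterance: log[p(O|w_correct) P(w_correct)]
                        - log sum_w [p(O|w) P(w)].
    """
    total = 0.0
    for ll, y in zip(log_likes, labels):
        joint = [l + p for l, p in zip(ll, log_priors)]
        total += joint[y] - log_sum_exp(joint)
    return total
```

Unlike the ML criterion, the MMI value depends on how the competing models score each utterance, so raising the likelihood of an incorrect model lowers the objective even when the correct model's likelihood is unchanged.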

References (18)
D. Povey, P. C. Woodland, "Large scale MMIE training for conversational telephone speech recognition," NIST: National Institute of Standards and Technology (2000).
Andreas Stolcke, John Butzberger, Horacio Franco, Jing Zheng, "Improved maximum mutual information estimation training of continuous density HMMs," Conference of the International Speech Communication Association, pp. 679-682 (2001).
L. Bahl, P. Brown, P. de Souza, R. Mercer, "Maximum mutual information estimation of hidden Markov model parameters for speech recognition," International Conference on Acoustics, Speech, and Signal Processing, vol. 11, pp. 49-52 (1986). DOI: 10.1109/ICASSP.1986.1169179
V. Valtchev, J. J. Odell, P. C. Woodland, S. J. Young, "MMIE training of large vocabulary recognition systems," Speech Communication, vol. 22, pp. 303-314 (1997). DOI: 10.1016/S0167-6393(97)00029-0
A. P. Dempster, N. M. Laird, D. B. Rubin, "Maximum Likelihood from Incomplete Data Via the EM Algorithm," Journal of the Royal Statistical Society: Series B (Methodological), vol. 39, pp. 1-22 (1977). DOI: 10.1111/J.2517-6161.1977.TB01600.X
Leonard E. Baum, Ted Petrie, George Soules, Norman Weiss, "A Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chains," Annals of Mathematical Statistics, vol. 41, pp. 164-171 (1970). DOI: 10.1214/AOMS/1177697196
P. S. Gopalakrishnan, D. Kanevsky, A. Nadas, D. Nahamoo, "An inequality for rational functions with applications to some statistical estimation problems," IEEE Transactions on Information Theory, vol. 37, pp. 107-113 (1991). DOI: 10.1109/18.61108
A. Nadas, D. Nahamoo, M. A. Picheny, "On a model-robust training method for speech recognition," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 36, pp. 1432-1436 (1988). DOI: 10.1109/29.90371
Y. Normandin, R. Cardin, R. De Mori, "High-performance connected digit recognition using maximum mutual information estimation," IEEE Transactions on Speech and Audio Processing, vol. 2, pp. 299-311 (1994). DOI: 10.1109/89.279279