Large scale discriminative training for speech recognition

Authors: D Povey, PC Woodland

Abstract: This paper describes, and evaluates on a large scale, the lattice-based framework for discriminative training of large vocabulary speech recognition systems based on Gaussian mixture hidden Markov models (HMMs). The paper concentrates on the maximum mutual information estimation (MMIE) criterion, which has been used to train HMM systems for conversational telephone speech transcription using up to 265 hours of data. These experiments represent the largest-scale application of discriminative training techniques of which the authors are aware, and have led to significant reductions in word error rate for both triphone and quinphone HMMs compared to our best models trained using maximum likelihood estimation. The MMIE lattice-based implementation used, techniques for ensuring improved generalisation, and interactions with adaptation are all discussed. Furthermore, several variations to the training scheme are introduced with the aim of reducing over-training.
