Margin-space integration of MPE loss via differencing of MMI functionals for generalized error-weighted discriminative training

Authors: Erik McDermott, Shinji Watanabe, Atsushi Nakamura


Keywords: Hidden Markov model, Computer science, Discriminative model, Hinge loss, Support vector machine, Speech recognition

Abstract: Using the central observation that margin-based weighted classification error (modeled using Minimum Phone Error (MPE)) corresponds to the derivative, with respect to the margin term, of margin-based hinge loss (modeled using Maximum Mutual Information (MMI)), this article subsumes and extends margin-based MPE and MMI within a broader framework in which the objective function is an integral over a range of margin values. Applying the Fundamental Theorem of Calculus, this integral is easily evaluated using finite differences of MMI functionals; lattice-based training with the new criterion can then be carried out using differences of MMI gradients. Experimental results comparing the new criterion with margin-based MMI and MCE on the Corpus of Spontaneous Japanese and the MIT OpenCourseWare/MIT-World corpus are presented.

1. Introduction

The field of discriminative training for speech recognition has witnessed considerable activity in recent years. The appeal of minimizing phone or word error rather than string error has motivated a transition from well-known string-level methods [1][2] to error-weighted approaches such as MPE [3][4]. More recently, there has been a surge of proposals for "large margin" approaches to hidden Markov model (HMM) design, such as "large-margin HMM" [5], "soft margin estimation" [6], and incrementally shifted margins [7]. Sha and Saul [8] made the important proposal that a fine-grained error measure, the Hamming distance between candidate recognition strings, itself be directly incorporated into HMM-based learning. It turns out that a margin that multiplies such an error measure can easily be brought into lattice-based HMM training as well, simply by adding a margin-scaled local frame/phone/word error to the lattice arc log-likelihoods during Forward-Backward computation [9][10][11]. This approach links the original use of the margin in the context of machine learning (e.g. Support Vector Machines (SVMs)) to "tried-and-tested" frameworks for large-scale discriminative training, with well-understood optimization on large-scale ASR tasks. Benefits in performance on such tasks have been reported for both margin-based MMI and margin-based MPE, though it appears that the relative gains for MMI are larger than those for MPE [10][11]. Aiming at leveraging the benefits of the margin concept in the context of MPE-style error-weighted HMM training, this article presents a unification of margin-based training based on a novel concept:
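The identity summarized in the abstract can be checked numerically with a few lines of NumPy. The sketch below is not the authors' lattice implementation: a tiny, invented N-best list stands in for a lattice, and all log-scores and phone-error counts are made up for illustration. It evaluates a margin-based MMI functional at two margin values and confirms that their difference matches the integral of the MPE-style expected error over that margin range, as the Fundamental Theorem of Calculus argument states.

    # Numerical sketch (not the authors' implementation) of the paper's central
    # identity: the derivative of a margin-based MMI functional with respect to
    # the margin is an MPE-style expected error, so the integral of that error
    # over a margin range collapses, by the Fundamental Theorem of Calculus, to
    # a difference of two MMI values. A small N-best list stands in for the
    # lattice; scores and error counts are invented for illustration.

    import numpy as np

    # Hypothetical N-best list for one utterance: combined log-scores and
    # phone-error counts against the reference (index 0 = reference, 0 errors).
    log_scores = np.array([-10.0, -10.5, -11.2, -12.0])
    errors     = np.array([  0.0,   2.0,   3.0,   5.0])

    def mmi_functional(rho):
        """Margin-based MMI value: log posterior of the reference when every
        competitor's score is boosted by rho times its error count."""
        boosted = log_scores + rho * errors
        return log_scores[0] - np.logaddexp.reduce(boosted)

    def expected_error(rho):
        """MPE-style loss: expected error under the margin-boosted posterior;
        analytically this equals -d/d(rho) of mmi_functional(rho)."""
        boosted = log_scores + rho * errors
        posterior = np.exp(boosted - np.logaddexp.reduce(boosted))
        return float(posterior @ errors)

    rho_lo, rho_hi = 0.0, 1.0

    # Left-hand side: integrate the MPE-style loss over the margin range
    # (trapezoidal rule on a fine grid).
    rhos = np.linspace(rho_lo, rho_hi, 2001)
    vals = np.array([expected_error(r) for r in rhos])
    integral = float(np.sum(0.5 * (vals[1:] + vals[:-1]) * np.diff(rhos)))

    # Right-hand side: a simple difference of the MMI functional at the two
    # margin values; no integration over the margin is required.
    difference = mmi_functional(rho_lo) - mmi_functional(rho_hi)

    print(integral, difference)  # the two values agree to high precision

In the paper's setting the same differencing is applied to lattice-based MMI functionals and their gradients rather than to an N-best list, but the margin-space relationship being exercised is the same.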

References (13)
George Saon, Daniel Povey, Penalty function maximization for large margin HMM training, Conference of the International Speech Communication Association, pp. 920-923, 2008.
Erik McDermott, Atsushi Nakamura, String and Lattice based Discriminative Training for the Corpus of Spontaneous Japanese Lecture Transcription Task, Conference of the International Speech Communication Association, pp. 2081-2084, 2007.
Georg Heigold, Thomas Deselaers, Ralf Schlüter, Hermann Ney, Modified MMI/MPE: a direct evaluation of the margin in speech recognition, Proceedings of the 25th International Conference on Machine Learning (ICML '08), pp. 384-391, 2008. DOI: 10.1145/1390156.1390205
Jinyu Li, Zhi-Jie Yan, Chin-Hui Lee, Ren-Hua Wang, A study on soft margin estimation for LVCSR, IEEE Automatic Speech Recognition and Understanding Workshop, pp. 268-271, 2007. DOI: 10.1109/ASRU.2007.4430122
Daniel Povey, Dimitri Kanevsky, Brian Kingsbury, Bhuvana Ramabhadran, George Saon, Karthik Visweswariah, Boosted MMI for model and feature-space discriminative training, International Conference on Acoustics, Speech, and Signal Processing, pp. 4057-4060, 2008. DOI: 10.1109/ICASSP.2008.4518545
Atsushi Nakamura, Erik McDermott, Shinji Watanabe, Shigeru Katagiri, A unified view for discriminative objective functions based on negative exponential of difference measure between strings, International Conference on Acoustics, Speech, and Signal Processing, pp. 1633-1636, 2009. DOI: 10.1109/ICASSP.2009.4959913
Xinwei Li, Hui Jiang, Chaojun Liu, Large margin HMMs for speech recognition, International Conference on Acoustics, Speech, and Signal Processing, vol. 5, pp. 513-516, 2005. DOI: 10.1109/ICASSP.2005.1416353
Fei Sha, Lawrence K. Saul, Large Margin Hidden Markov Models for Automatic Speech Recognition, Neural Information Processing Systems, vol. 19, pp. 1249-1256, 2006.
D. Povey, P.C. Woodland, Minimum Phone Error and I-smoothing for improved discriminative training, International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 105-108, 2002. DOI: 10.1109/ICASSP.2002.5743665
Ralf Schlüter, Hermann Ney, Lars Haferkamp, Wolfgang Macherey, Investigations on Error Minimizing Training Criteria for Discriminative Training in Automatic Speech Recognition, Conference of the International Speech Communication Association, pp. 2133-2136, 2005.