Authors: Erik McDermott, Shinji Watanabe, Atsushi Nakamura
DOI:
Keywords: Hidden Markov model, Computer science, Discriminative model, Hinge loss, Support vector machine, Speech recognition
Abstract: Using the central observation that margin-based weighted classification error (modeled using Minimum Phone Error (MPE)) corresponds to the derivative with respect to the margin term of margin-based hinge loss (modeled using Maximum Mutual Information (MMI)), this article subsumes and extends margin-based MPE and MMI within a broader framework in which the objective function is an integral of MPE loss over a range of margin values. Applying the Fundamental Theorem of Calculus, this integral is easily evaluated using finite differences of MMI functionals; lattice-based training using the new criterion can then be carried out using differences of MMI gradients. Experimental results comparing the new criterion with margin-based MMI, MCE and MPE on the Corpus of Spontaneous Japanese and the MIT OpenCourseWare/MIT-World corpus are presented.

1. Introduction

The field of discriminative training for speech recognition has witnessed considerable activity in recent years. The appeal of minimizing phone or word error rather than string error has motivated the transition from well-known string-level methods such as MMI and MCE [1][2] to error-weighted approaches, such as MPE [3][4]. More recently, there has been a surge of proposals for “large margin” approaches to hidden Markov model (HMM) design, such as “large-margin HMM” [5], “soft margin estimation” [6], and an incrementally shifted margin [7]. Sha and Saul [8] made the important proposal of a fine-grained error measure, the Hamming distance between candidate recognition strings, itself directly incorporated into HMM-based learning. It turns out that introducing a margin that multiplies the error can easily be brought into MMI-based HMM training as well, simply by adding the margin-scaled local frame/phone/word error to the lattice arc log-likelihoods during the Forward-Backward computation [9][10][11]. This approach links the original use of margin in the context of machine learning (e.g. Support Vector Machines (SVMs)) with “tried-and-tested” frameworks for large-scale discriminative training, with well-understood optimization for large-scale ASR tasks.
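The relationship between margin-based MMI and MPE stated in the abstract can be checked numerically on a toy problem. The sketch below is not the authors' lattice-based implementation: it assumes a small N-best list in place of a lattice, a log-sum-exp form for the margin-based MMI loss, and a sign convention in which the margin-scaled error is added to each hypothesis score. All names (`margin_mmi_loss`, `margin_mpe_loss`, `scores`, `errors`, `rho`) and the numeric values are hypothetical.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def margin_mmi_loss(scores, errors, ref_index, rho):
    """Assumed margin-based MMI loss: log-sum-exp of error-boosted scores
    minus the reference score (a smooth, hinge-like loss)."""
    boosted = [s + rho * e for s, e in zip(scores, errors)]
    return log_sum_exp(boosted) - scores[ref_index]


def margin_mpe_loss(scores, errors, rho):
    """Assumed margin-based MPE loss: expected error under the posterior
    computed from the same error-boosted scores."""
    boosted = [s + rho * e for s, e in zip(scores, errors)]
    log_z = log_sum_exp(boosted)
    return sum(math.exp(b - log_z) * e for b, e in zip(boosted, errors))


def integrated_mpe_numeric(scores, errors, rho_lo, rho_hi, steps=2000):
    """Midpoint-rule integral of the MPE loss over [rho_lo, rho_hi]."""
    h = (rho_hi - rho_lo) / steps
    total = sum(
        margin_mpe_loss(scores, errors, rho_lo + (k + 0.5) * h)
        for k in range(steps)
    )
    return h * total


# Hypothetical N-best list: log-scores and per-hypothesis error counts.
# Index 0 plays the role of the reference (zero errors).
scores = [-10.0, -11.5, -12.0, -9.5]
errors = [0.0, 1.0, 2.0, 3.0]
ref_index = 0
rho_lo, rho_hi = 0.0, 1.0

# Fundamental Theorem of Calculus: the integral of the MPE loss over the
# margin interval equals a difference of two MMI loss values.
closed_form = (
    margin_mmi_loss(scores, errors, ref_index, rho_hi)
    - margin_mmi_loss(scores, errors, ref_index, rho_lo)
)
numeric = integrated_mpe_numeric(scores, errors, rho_lo, rho_hi)

# Derivative check at one margin value via a central finite difference.
rho, eps = 0.5, 1e-5
mmi_slope = (
    margin_mmi_loss(scores, errors, ref_index, rho + eps)
    - margin_mmi_loss(scores, errors, ref_index, rho - eps)
) / (2 * eps)
mpe_value = margin_mpe_loss(scores, errors, rho)

print("difference of MMI losses:", closed_form)
print("numeric integral of MPE: ", numeric)
print("d(MMI)/d(rho) vs MPE:    ", mmi_slope, mpe_value)
```

Under these assumptions the two integral estimates agree closely, and the slope of the MMI loss in the margin matches the MPE value. This is why, in a setting like the one sketched here, an objective defined as an integral over margins needs only two MMI evaluations rather than explicit numerical integration.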
Benefits in performance on such tasks have been reported for margin-based MMI and MPE, though it appears that the relative gains for MMI are larger than for MPE [10][11]. Aiming at leveraging the benefits of margin in the context of MPE-style error-weighted HMM training, this article presents a unification of MPE and MMI training based on a novel concept: