Large Margin Discriminative Semi-Markov Model for Phonetic Recognition

Authors: Sungwoong Kim, Sungrack Yun, Chang D. Yoo

DOI: 10.1109/TASL.2011.2108286

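Abstract: This paper considers a large margin discriminative semi-Markov model (LMSMM) for phonetic recognition. The hidden Markov model (HMM) framework that is often used for phonetic recognition assumes only local statistical dependencies between adjacent observations, and it is used to predict a label for each observation without explicit phone segmentation. On the other hand, the semi-Markov model (SMM) framework allows simultaneous segmentation and labeling of sequential data based on a segment-based Markovian structure that assumes statistical dependencies among all the observations within a segment. For phonetic recognition, which is inherently a joint segmentation and labeling problem, the SMM framework has the potential to perform better than the HMM framework at the expense of a slight increase in computational complexity. The SMM framework considered in this paper is based on a non-probabilistic discriminant function that is linear in the joint feature map, which attempts to capture the long-range statistical dependencies among observations. The parameters of the discriminant function are estimated by a large margin learning criterion for structured prediction. The parameter estimation problem in hand leads to an optimization problem with many margin constraints, and this constrained optimization problem is solved using a stochastic gradient descent algorithm. The proposed LMSMM outperformed the large margin discriminative HMM on the TIMIT phonetic recognition task.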
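
The joint segmentation and labeling described in the abstract can be illustrated with a minimal semi-Markov Viterbi decoder. This is only a sketch under stated assumptions, not the authors' implementation: the segment feature (sum of frames plus a constant per-segment bias), the names `segment_score` and `semi_markov_decode`, and the parameters `W`, `T`, and `max_len` are hypothetical choices made for illustration, and the paper's actual joint feature map is not reproduced here.

```python
import numpy as np

def segment_score(X, start, end, label, W):
    """Linear score for labeling frames X[start:end] as one segment.

    Assumed (hypothetical) segment feature: sum of the frames in the
    segment, plus a constant 1 acting as a per-segment bias.
    """
    feat = np.append(X[start:end].sum(axis=0), 1.0)
    return float(W[label] @ feat)

def semi_markov_decode(X, W, T, max_len):
    """Find the highest-scoring segmentation and labeling of X.

    X: (n, d) frames, W: (K, d + 1) segment weights,
    T: (K, K) label-transition scores, max_len: longest allowed segment.
    Returns a list of (start, end, label) segments and the total score.
    """
    n, K = X.shape[0], W.shape[0]
    best = np.full((n + 1, K), -np.inf)  # best[e, y]: best score ending at e with label y
    back = {}                            # (e, y) -> (segment start, previous label)
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            for y in range(K):
                s = segment_score(X, start, end, y, W)
                if start == 0:
                    cand, prev = s, None
                else:
                    prev_scores = best[start] + T[:, y]
                    prev = int(np.argmax(prev_scores))
                    cand = prev_scores[prev] + s
                if cand > best[end, y]:
                    best[end, y] = cand
                    back[(end, y)] = (start, prev)
    # Trace the best path back from the final frame.
    y = int(np.argmax(best[n]))
    total = float(best[n, y])
    segments, end = [], n
    while end > 0:
        start, prev = back[(end, y)]
        segments.append((start, end, y))
        end, y = start, prev
    return segments[::-1], total

# Toy usage: three frames that favor label 0, then two that favor label 1.
X = np.array([[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 2)
W = np.array([[1.0, -1.0, -0.5], [-1.0, 1.0, -0.5]])  # last column: per-segment penalty
T = np.zeros((2, 2))
segments, total = semi_markov_decode(X, W, T, max_len=4)
print(segments, total)
```

Unlike frame-level HMM decoding, the inner loop over `start` scores whole candidate segments, which is where the extra computational cost relative to an HMM comes from: roughly a factor of `max_len` more work per frame.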
