Increasing the mixture components of non-uniform HMM structures based on a variational Bayesian approach.

Authors: Takatoshi Jitsuhiro, Satoshi Nakamura

DOI:

Keywords: Maximum likelihood, Overfitting, Variational message passing, Computer science, Algorithm, Hidden Markov model, Variable-order Bayesian network, State (computer science), Bayesian probability

Abstract: We propose using the Variational Bayesian (VB) approach for automatically creating non-uniform, context-dependent HMM topologies. Although the Maximum Likelihood (ML) criterion is generally used to create HMM topologies, it has an overfitting problem. Recently, to avoid this problem, VB has been applied to acoustic models for speech recognition. We introduce VB into the Successive State Splitting (SSS) algorithm, which can create both contextual and temporal variations for HMMs. Experimental results show that the proposed method creates a more efficient model than the original method. Furthermore, we evaluated a method to increase the number of mixture components by considering the HMM structures. The VB approach obtained the best performance with a smaller number of mixture components in comparison with ML-based methods.
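The overfitting problem mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' method: instead of the VB objective, it uses an MDL/BIC-style penalty as a simpler stand-in, and the function names, the one-dimensional data, and the choice of two parameters per state are all assumptions made for illustration. It shows that the ML gain from splitting one Gaussian state into two is never negative, so ML alone cannot decide when to stop splitting, whereas a penalized criterion can reject an uninformative split.

```python
import math

def gaussian_ml_loglik(xs):
    """Log-likelihood of 1-D data under one Gaussian fitted by ML (hypothetical helper)."""
    n = len(xs)
    mean = sum(xs) / n
    ss = sum((x - mean) ** 2 for x in xs)
    var = max(ss / n, 1e-6)  # small variance floor to avoid log(0)
    return -0.5 * n * math.log(2 * math.pi * var) - 0.5 * ss / var

def split_gains(group_a, group_b, params_per_state=2):
    """Return (ML gain, penalized gain) for splitting one state into two.

    group_a and group_b are the samples assigned to each candidate state,
    for example by a contextual factor. The penalty is an MDL/BIC-style
    term, used here as a stand-in for a Bayesian criterion (assumption).
    """
    pooled = list(group_a) + list(group_b)
    ml_gain = (gaussian_ml_loglik(group_a) + gaussian_ml_loglik(group_b)
               - gaussian_ml_loglik(pooled))
    # One extra state adds params_per_state free parameters (mean, variance).
    penalty = 0.5 * params_per_state * math.log(len(pooled))
    return ml_gain, ml_gain - penalty

# Informative context: the two groups have clearly different means.
informative = split_gains([-2.1, -1.9, -2.0, -2.2, -1.8],
                          [2.1, 1.9, 2.0, 2.2, 1.8])

# Uninformative context: both groups look like the same distribution.
uninformative = split_gains([-1.0, 0.0, 1.0, -0.5, 0.5],
                            [-0.9, 0.1, 0.9, -0.6, 0.6])

print("informative split:   ML gain %.3f, penalized gain %.3f" % informative)
print("uninformative split: ML gain %.3f, penalized gain %.3f" % uninformative)
```

In this toy setting the ML gain is positive for both splits, while the penalized gain is positive only for the informative one. A VB criterion plays a comparable role in principle, but it is derived from prior and posterior distributions over the model parameters rather than from a fixed parameter-count penalty.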

References (10)
Takatoshi Jitsuhiro, Tomoko Matsui, Satoshi Nakamura. Automatic generation of non-uniform context-dependent HMM topologies based on the MDL criterion. Conference of the International Speech Communication Association (2003).
Koichi Shinoda, Takao Watanabe. Acoustic modeling based on the MDL principle for speech recognition. Conference of the International Speech Communication Association (1997).
Hagai Attias. Inferring parameters and structure of latent variable models by variational Bayes. Uncertainty in Artificial Intelligence, pp. 21-30 (1999).
Naonori Ueda, Shinji Watanabe, Atsushi Nakamura, Yasuhiro Minami. Bayesian acoustic modeling for spontaneous speech recognition (2004).
Fabio Valente. Variational Bayesian GMM for speech recognition. Conference of the International Speech Communication Association (2003).
M. Ostendorf, H. Singer. HMM topology design using maximum likelihood successive state splitting. Computer Speech & Language, vol. 11, pp. 17-41 (1997). DOI: 10.1006/CSLA.1996.0021
T. Jitsuhiro, S. Nakamura. Automatic generation of non-uniform HMM structures based on variational Bayesian approach. International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 805-808 (2004). DOI: 10.1109/ICASSP.2004.1326108
H. Yamamoto, Y. Sagisaka. Multi-class composite N-gram based on connection direction. International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 533-536 (1999). DOI: 10.1109/ICASSP.1999.758180
S. J. Young, J. J. Odell, P. C. Woodland. Tree-based state tying for high accuracy acoustic modelling. Proceedings of the Workshop on Human Language Technology (HLT '94), pp. 307-312 (1994). DOI: 10.3115/1075812.1075885
S. J. Young. Tree-based state tying for high accuracy acoustic modeling. International Conference on Acoustics, Speech, and Signal Processing, pp. 307-312 (1994).