Feature-Dependent Allophone Clustering

Authors: Shigeki Sagayama, Hiroshi Shimodaira, Shigeki Matsuda, Mitsuru Nakai

DOI:

Keywords: Artificial intelligence, Feature vector, Allophone, Cluster analysis, Computer science, Structure (mathematical logic), Pattern recognition, Speech recognition, Hidden Markov model, Feature (machine learning)

Abstract: We propose a novel method for clustering allophones, called Feature-Dependent Allophone Clustering (FD-AC), that determines feature-dependent HMM topology automatically. Existing allophone clustering methods are based on sharing parameters between allophone models whose feature vector sequences behave similarly. However, not all features of the vector sequences necessarily share a common clustering structure. The sequences may therefore be modeled better by allocating an optimal structure to each feature. In this paper, we propose Feature-Dependent Successive State Splitting (FD-SSS) as an implementation of FD-AC. In speaker-dependent continuous phoneme recognition experiments, HMMs created by FD-SSS reduced error rates by about 10% compared with conventional HMMs that use a common structure for all features.
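
The core idea of the abstract, that each feature can receive its own clustering structure rather than sharing one across all features, can be illustrated with a toy sketch. This is not the authors' FD-SSS algorithm (which splits HMM states); it is a simple threshold-based grouping of 1-D means, and the allophone labels, statistics, threshold, and function names are all hypothetical.

```python
from typing import Dict, List, Tuple


def cluster_by_threshold(means: Dict[str, float], threshold: float) -> List[List[str]]:
    """Group labels whose sorted 1-D means differ from the previous one by <= threshold."""
    ordered = sorted(means, key=lambda label: means[label])
    clusters: List[List[str]] = []
    for label in ordered:
        # Extend the current cluster if this mean is close to the last member's mean.
        if clusters and means[label] - means[clusters[-1][-1]] <= threshold:
            clusters[-1].append(label)
        else:
            clusters.append([label])
    return clusters


def feature_dependent_clusters(
    stats: Dict[str, Tuple[float, ...]], threshold: float
) -> List[List[List[str]]]:
    """Cluster the allophones once per feature, giving each feature its own structure."""
    n_features = len(next(iter(stats.values())))
    return [
        cluster_by_threshold({label: vec[f] for label, vec in stats.items()}, threshold)
        for f in range(n_features)
    ]


# Hypothetical per-allophone means for two features (assumed toy values).
# Feature 0 varies with the left context, feature 1 with the right context.
stats = {
    "a-k+a": (1.0, 5.0),
    "a-k+i": (1.1, 9.0),
    "i-k+a": (4.0, 5.2),
    "i-k+i": (4.2, 9.1),
}

structures = feature_dependent_clusters(stats, threshold=0.5)
for f, clusters in enumerate(structures):
    print(f"feature {f}: {clusters}")
```

In this toy data, feature 0 groups the allophones by left context and feature 1 groups them by right context, so each feature needs only two clusters. A single structure shared by both features would need four clusters to keep the same distinctions.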

References (7)
S. Takahashi, S. Sagayama, "Four-level tied-structure for efficient representation of acoustic modeling," International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 520-523 (1995), doi:10.1109/ICASSP.1995.479643
S. Sagayama, "Asynchronous-Transition HMM for Acoustic Modeling," International Conference on Acoustics, Speech, and Signal Processing, pp. 1001-1004 (2000)
J.R. Bellegarda, D. Nahamoo, "Tied mixture continuous parameter models for large vocabulary isolated speech recognition," International Conference on Acoustics, Speech, and Signal Processing, pp. 13-16 (1989), doi:10.1109/ICASSP.1989.266351
S. Matsuda, M. Nakai, H. Shimodaira, S. Sagayama, "Asynchronous-transition HMM," International Conference on Acoustics, Speech, and Signal Processing, vol. 2, pp. 1005-1008 (2000), doi:10.1109/ICASSP.2000.859132
X.D. Huang, K.F. Lee, H.W. Hon, M.Y. Hwang, "Improved acoustic modeling with the SPHINX speech recognition system," International Conference on Acoustics, Speech, and Signal Processing, pp. 345-348 (1991), doi:10.1109/ICASSP.1991.150347
J. Takami, S. Sagayama, "A successive state splitting algorithm for efficient allophone modeling," International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 573-576 (1992), doi:10.1109/ICASSP.1992.225855
M. Ostendorf, H. Singer, "HMM topology design using maximum likelihood successive state splitting," Computer Speech & Language, vol. 11, pp. 17-41 (1997), doi:10.1006/CSLA.1996.0021