Authors: S. Takahashi, S. Sagayama
DOI: 10.1109/ICASSP.1995.479643
Keywords: Estimation theory, Representation (mathematics), Artificial intelligence, Hidden Markov model, Robustness (computer science), Gaussian process, Training set, Dimension (vector space), Pattern recognition, Context model, Multivariate normal distribution, Computer science, Word recognition
Abstract: One of the problems with context-dependent HMMs is that a large number of model parameters must be estimated from a limited amount of training data. Parameters that have the same property are tied in order to represent acoustic models efficiently. This paper proposes a four-level tied-structure for phoneme models. The four levels are 1) the model level, 2) the state level, 3) the distribution level, and 4) the feature parameter level. Although some techniques have been proposed for the first three levels, tying at the fourth level is newly proposed in this paper. We found that the four-level tied-structure makes it possible to represent 1,600 mean vectors of multivariate Gaussian mixtures by combinations of 16 representative values for each dimension. Experimental results show that the tied-structure reduces the calculation required for recognition without significantly degrading recognition performance. Furthermore, we also found that it is effective for training.
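As a rough illustration of tying at the feature parameter level, the sketch below replaces every component of a set of mean vectors with one of K representative values for that dimension, so each mean vector is stored as a row of small indices. The abstract does not say how the representative values are chosen. The 1-D k-means used here, the function names (`quantize_1d`, `tie_feature_level`, `squared_distances`), and the toy sizes are assumptions made for illustration, not the authors' method. The last function shows one plausible source of the reduced computation: the per-dimension squared differences are computed once per input frame for the K values and then shared by all distributions through table lookup.

```python
import random


def quantize_1d(values, k, iters=20):
    """1-D k-means (Lloyd's algorithm). Returns (codebook, indices)."""
    lo, hi = min(values), max(values)
    # Initialise the k centroids evenly across the observed value range.
    codebook = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]

    def assign():
        # Index of the nearest centroid for every value.
        return [min(range(k), key=lambda j: (v - codebook[j]) ** 2)
                for v in values]

    for _ in range(iters):
        indices = assign()
        for j in range(k):
            members = [v for v, i in zip(values, indices) if i == j]
            if members:  # leave empty clusters where they are
                codebook[j] = sum(members) / len(members)
    return codebook, assign()


def tie_feature_level(means, k):
    """Quantize each dimension of the mean vectors independently.

    Returns (codebooks, index_table) where codebooks[d] holds the k
    representative values for dimension d and index_table[n][d] selects
    the value used by mean vector n in dimension d.
    """
    dims = len(means[0])
    codebooks = []
    index_table = [[0] * dims for _ in means]
    for d in range(dims):
        codebook, indices = quantize_1d([m[d] for m in means], k)
        codebooks.append(codebook)
        for n, i in enumerate(indices):
            index_table[n][d] = i
    return codebooks, index_table


def squared_distances(x, codebooks, index_table):
    """Squared Euclidean distance from frame x to every tied mean vector."""
    # dims * k subtractions and squarings, done once per frame.
    table = [[(x[d] - c) ** 2 for c in codebook]
             for d, codebook in enumerate(codebooks)]
    # Each distribution then needs only lookups and additions.
    return [sum(table[d][i] for d, i in enumerate(row))
            for row in index_table]


# Toy data (hypothetical sizes, much smaller than the 1,600 means in the abstract).
random.seed(0)
means = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(200)]
codebooks, index_table = tie_feature_level(means, k=16)
frame = [0.1, -0.3, 0.7, 0.0]
distances = squared_distances(frame, codebooks, index_table)
print(len(codebooks), len(codebooks[0]), len(distances))
```

With this layout the storage for the means is one K-entry codebook per dimension plus an index table, and the cost of the per-dimension arithmetic no longer grows with the number of distributions.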