Variational inference and learning for segmental switching state space models of hidden speech dynamics

Authors: L.J. Lee, H. Attias, Li Deng

DOI: 10.1109/ICASSP.2003.1198920

Keywords: Speech enhancement, Machine learning, Natural language, Computer science, Inference, Bayesian network, Speech processing, Artificial intelligence, Speech production, Hidden Markov model, State space

Abstract: This paper describes novel and powerful variational EM algorithms for the segmental switching state space models used in speech applications, which are capable of capturing key internal (or hidden) dynamics of natural speech production. Hidden dynamic models (HDMs) have recently become a class of promising acoustic models to incorporate crucial speech-specific knowledge and overcome many inherent weaknesses of traditional HMMs. However, the lack of efficient statistical learning algorithms is one of the main obstacles preventing them from being well studied and widely used. Since exact inference is intractable, a variational approach is taken to develop effective approximate algorithms. We implemented the segmental constraint for modeling speech dynamics and present algorithms for recovering the hidden speech dynamics and discrete speech units from acoustic data only. The effectiveness of the algorithms developed is verified by experiments on simulation and Switchboard data.

