Probabilistic-trajectory segmental HMMs

Authors: W.J. Holmes, M.J. Russell

DOI: 10.1006/CSLA.1998.0048

Abstract: “Segmental hidden Markov models” (SHMMs) are intended to overcome important speech-modelling limitations of the conventional-HMM approach by representing sequences (or segments) of features and incorporating the concept of trajectories to describe how features change over time. A novel feature of the approach presented in this paper is that extra-segmental variability between different examples of a sub-phonemic speech segment is modelled separately from intra-segmental variability within any one example. The extra-segmental component of the model is represented in terms of variability in the trajectory parameters, and these models are therefore referred to as “probabilistic-trajectory segmental HMMs” (PTSHMMs). This paper presents the theory of PTSHMMs using a linear trajectory description characterized by slope and mid-point parameters, and presents theoretical and experimental comparisons between different types of PTSHMMs, simpler SHMMs and conventional HMMs. Experiments have demonstrated that, for any given feature set, a linear PTSHMM can substantially reduce the error rate in comparison with a conventional HMM, both for a connected-digit recognition task and for a phonetic classification task. Performance benefits have been demonstrated from incorporating a linear trajectory description and additionally from modelling variability in the mid-point parameter.
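
As an illustration of the linear-trajectory idea described in the abstract, the following is a minimal sketch and not the authors' formulation. It assumes a single scalar feature, a fixed slope, a Gaussian-distributed mid-point (the extra-segmental variability) and independent Gaussian frame noise around the trajectory (the intra-segmental variability). All function and parameter names (`linear_trajectory`, `segment_log_likelihood`, `var_intra`, `var_mid`) are hypothetical.

```python
import math


def linear_trajectory(mid_point, slope, num_frames):
    """Values of a straight-line trajectory, centred on the segment mid-point."""
    centre = (num_frames - 1) / 2.0
    return [mid_point + slope * (t - centre) for t in range(num_frames)]


def segment_log_likelihood(frames, mean_mid, slope, var_intra, var_mid=0.0):
    """Log-likelihood of one scalar-feature segment (illustrative assumptions).

    Assumed model: the mid-point is drawn once per segment from
    N(mean_mid, var_mid); each frame is then drawn independently from
    N(trajectory value, var_intra). The mid-point is integrated out, which
    gives a covariance of var_intra * I + var_mid * (all-ones matrix).
    With var_mid = 0 this reduces to a fixed-trajectory model.
    """
    n = len(frames)
    trajectory = linear_trajectory(mean_mid, slope, n)
    residuals = [y - f for y, f in zip(frames, trajectory)]
    sum_sq = sum(r * r for r in residuals)
    sum_r = sum(residuals)

    # Closed forms for the determinant and inverse of the rank-one-updated
    # covariance (matrix determinant lemma and Sherman-Morrison identity).
    total = var_intra + n * var_mid
    log_det = (n - 1) * math.log(var_intra) + math.log(total)
    quad = (sum_sq - var_mid * sum_r ** 2 / total) / var_intra

    return -0.5 * (n * math.log(2.0 * math.pi) + log_det + quad)
```

Setting `var_mid` above zero lets a whole segment be offset from the mean trajectory at a lower cost than the same offset explained frame by frame, which is the distinction between extra-segmental and intra-segmental variability that the abstract draws.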
