Hierarchical phoneme discrimination by hidden Markov modelling using cepstrum and formant information

Authors: Y. Ariki, F.R. McInnes, M.A. Jack

DOI: 10.1109/ICASSP.1989.266514

Abstract: A report is presented of comparative results for vowel classification using hidden Markov models based on linear predictive coding (LPC)-based cepstral vectors and formant features. The accuracy is shown to be significantly improved by time duration constraints in the formant feature space, especially using the mel-frequency representation and its derivative. The highest recognition accuracy is obtained by integrating the two feature spaces, by multiplying the probabilities computed in the separate feature spaces. This improvement is extended to the more general phoneme recognition task by use of a hierarchical integration method, which utilizes the formant feature space together with the vowel and consonant LPC-based cepstral space.
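
The integration step mentioned in the abstract, multiplying the probabilities computed in the separate feature spaces, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the vowel labels, and the numeric scores below are all hypothetical, and the sketch assumes each feature space has already produced one likelihood per candidate class. Products are taken as sums of log-likelihoods, which is the usual way to avoid numerical underflow.

```python
import math

def combine_scores(cepstral_loglik, formant_loglik):
    """Sum per-class log-likelihoods from two feature spaces.

    Adding log-likelihoods is equivalent to multiplying the
    underlying probabilities. Only classes scored in both
    spaces are kept.
    """
    shared = cepstral_loglik.keys() & formant_loglik.keys()
    return {label: cepstral_loglik[label] + formant_loglik[label]
            for label in shared}

def classify(cepstral_loglik, formant_loglik):
    """Return the class with the highest combined score."""
    combined = combine_scores(cepstral_loglik, formant_loglik)
    return max(combined, key=combined.get)

# Hypothetical per-class likelihoods (illustrative values only,
# not results from the paper).
cepstral = {"a": math.log(0.5), "i": math.log(0.3), "u": math.log(0.2)}
formant = {"a": math.log(0.2), "i": math.log(0.6), "u": math.log(0.2)}

best = classify(cepstral, formant)
print(best)
```

With these made-up values the cepstral scores alone favour "a", but the combined score favours "i", which shows how the second feature space can change the decision.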
