A comparative study of continuous speech recognition using neural networks and hidden Markov models

Authors: S. Renals, D. McKelvie, F. McInnes

DOI: 10.1109/ICASSP.1991.150353


Abstract: The recognition performances of two front ends are compared for continuous speech recognition tasks. First, a neural network model (NNM) front end was used, with frame labeling performed by a radial basis function network and segmentation by a Viterbi algorithm. The second front end was a discrete hidden Markov model (HMM), featuring explicit state duration probability distributions. Two experiments were performed. The first used a speaker-dependent database, with a lexicon of 571 words. Using a low-perplexity grammar, the NNM front end produced a word accuracy of 94% and a sentence accuracy of 86%. This was slightly inferior to the HMM front end, which produced accuracies of 96% and 88%. Without a grammar, accuracies of 58% (NNM) and 49% (HMM) were recorded. The second set of experiments used the MIT portion of the TIMIT database (415 speakers and 2072 sentences in total). Results were poor for both front ends, with one producing marginally better results.

