A hybrid syllable recognition system based on vowel spotting

Authors: John Sirigos, Nikos Fakotakis, George Kokkinakis

DOI: 10.1016/S0167-6393(02)00012-2

Abstract: In this paper we present a hybrid ANN/HMM syllable recognition system based on vowel spotting. Using an advanced multilevel vowel-spotting module, we track all vowel phonemes in speech signals and then model the segments located between two successive vowels, which are defined as syllables. In order to achieve minimum vowel losses and accurate detection, we focus on taking special care of the vowel spotter. The spotter is based on three different techniques: discrete hidden Markov models (DHMMs), multilayer perceptrons (MLPs), and heuristic rules. To set up the syllable segments, DHMMs with multiple codebooks are used. The usual DHMM probability parameters are replaced by combined neural network outputs. For this purpose, we use both context-dependent and context-independent neural networks. The system was tested on the TIMIT and NTIMIT databases, and the results obtained showed 75.09% and 59.30% average accuracy, respectively. It has to be noted that for the above results no grammars or syllable-based lexicons were used.
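
The segmentation idea in the abstract (a syllable is the stretch between two successive spotted vowels) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `syllable_segments`, the use of frame indices for vowel positions, and the choice to use the vowel positions themselves as segment boundaries are all assumptions made for illustration, since the abstract does not specify them.

```python
from typing import List, Tuple


def syllable_segments(vowel_frames: List[int]) -> List[Tuple[int, int]]:
    """Pair each spotted vowel position with the next one.

    `vowel_frames` holds hypothetical frame indices returned by a vowel
    spotter. Each (start, end) pair spans two successive vowels, which is
    how the abstract defines a syllable segment. Exact boundary placement
    is an assumption of this sketch.
    """
    ordered = sorted(vowel_frames)
    # zip stops at the shorter sequence, so fewer than two vowels yields [].
    return list(zip(ordered, ordered[1:]))


if __name__ == "__main__":
    # Three spotted vowels give two inter-vowel segments.
    print(syllable_segments([40, 12, 95]))
```

In the system described, each such segment would then be scored by the DHMM/neural network models; that stage is not sketched here because the abstract gives no detail on how the network outputs are combined.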

References (26)
John S. D. Mason, Simon Downey, Rhys James Jones. Continuous speech recognition using syllables. Conference of the International Speech Communication Association (1997).
Mari Ostendorf, Michiel Bacchiani. Using automatically-derived acoustic sub-word units in large vocabulary speech recognition. Conference of the International Speech Communication Association (1998).
George Kokkinakis, John Sirigos, Nikos Fakotakis. A comparison of several speech parameters for speaker independent speech recognition and speaker recognition. Conference of the International Speech Communication Association (1995).
Hervé Bourlard. Connectionist speech recognition (1993).
J. Sirigos, N. Fakotakis, G. Kokkinakis. A high-performance vowel spotting system based on a multistage architecture. European Signal Processing Conference, pp. 1-4 (1998). DOI: 10.5281/ZENODO.36502
Herve A. Bourlard, Nelson Morgan. Connectionist Speech Recognition: A Hybrid Approach. Kluwer Academic Publishers (1993).
David B. Roe, Jay G. Wilpon. Voice communication between humans and machines. National Academy Press (1994).
Zhihong Hu, J. Schalkwyk, E. Barnard, R. Cole. Speech recognition using syllable-like units. International Conference on Spoken Language Processing, vol. 2, pp. 1117-1120 (1996). DOI: 10.1109/ICSLP.1996.607802
L. Bahl, P. Brown, P. de Souza, R. Mercer. Maximum mutual information estimation of hidden Markov model parameters for speech recognition. International Conference on Acoustics, Speech, and Signal Processing, vol. 11, pp. 49-52 (1986). DOI: 10.1109/ICASSP.1986.1169179
Ho-Jin Yu, Yung-Hwan Oh. A neural network using acoustic sub-word units for continuous speech recognition. International Conference on Spoken Language Processing, vol. 1, pp. 506-509 (1996). DOI: 10.1109/ICSLP.1996.607165