Significance of segmentation in phoneme based Tamil speech recognition system

Authors: S. Harish, P. Vijayalakshmi, T. Nagarajan

DOI: 10.1109/ICECTECH.2011.5941739

Keywords: Audio mining, Acoustic model, Speech corpus, Speech segmentation, Speech processing, Speech recognition, Speech analytics, Voice activity detection, Artificial intelligence, Computer science, Natural language processing, Speech synthesis

Abstract: Over the last few decades, speech recognition has evolved and matured enough to be used in commercial applications. These applications include automatic dictation software, voice dialling, voice-controlled navigation, and simple data entry. Automatic Speech Recognition (ASR) deals with the conversion of the acoustic signals of an utterance into text. In this work, a speech recognition system for the Tamil language is developed. Speech recognition requires segmentation of the speech waveform into fundamental units. The word is the natural unit of speech. However, if each word is trained individually, there cannot be any sharing of parameters among words. Hence, it is essential to have a very large training set so that all words in the vocabulary are adequately trained. There is also the problem of memory requirement, which grows linearly with the number of words. To overcome this constraint, the phone is the preferred unit. It requires fewer models, and they can be well trained. For the current work, units such as monophones and triphones are considered. This work highlights the importance of segmented speech and of modelling the co-articulation effect that influences speech production. The triphone model considers this effect. Monophone- and triphone-based systems are developed, and their performance shows the importance of the above-mentioned parameters.
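The difference between monophone and triphone units described in the abstract can be illustrated with a minimal sketch. It expands a phone sequence into context-dependent labels written in the common "left-centre+right" style. The function name `to_triphones` and the example phone sequence are hypothetical and chosen only for illustration; this is not the authors' implementation, and it handles word-internal context only.

```python
from typing import List


def to_triphones(phones: List[str]) -> List[str]:
    """Expand a monophone sequence into word-internal triphone labels.

    Each label has the form "left-centre+right". Phones at the sequence
    edges keep only the context that actually exists.
    """
    labels = []
    for i, centre in enumerate(phones):
        # Left context is present for every phone except the first.
        left = f"{phones[i - 1]}-" if i > 0 else ""
        # Right context is present for every phone except the last.
        right = f"+{phones[i + 1]}" if i < len(phones) - 1 else ""
        labels.append(f"{left}{centre}{right}")
    return labels


# Hypothetical phone transcription, for illustration only.
monophones = ["a", "m", "m", "a"]
triphones = to_triphones(monophones)
print(triphones)  # ['a+m', 'a-m+m', 'm-m+a', 'm-a']
```

In this sketch the two occurrences of "m" receive different labels ("a-m+m" and "m-m+a"), which is how a triphone inventory captures the influence of neighbouring phones, at the cost of more distinct units than the monophone inventory.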

References (5)
Douglas D. O'Shaughnessy, P. Vijayalakshmi, T. Nagarajan, Combining multiple-sized sub-word units in a speech recognition system using baseform selection, Conference of the International Speech Communication Association (2006)
Steve Young, Gunnar Evermann, Mark Gales, Thomas Hain, Dan Kershaw, Xunying Liu, Gareth Moore, Julian Odell, Dave Ollason, Dan Povey, Valtcho Valtchev, Phil Woodland, The HTK Book, Cambridge University Engineering Department and Entropic Cambridge Research Laboratory (1995)
Lawrence Rabiner, Biing-Hwang Juang, Fundamentals of Speech Recognition (1993)
Lawrence R. Rabiner, Ronald W. Schafer, Digital Processing of Speech Signals (1978)
L.R. Rabiner, A tutorial on hidden Markov models and selected applications in speech recognition, Proceedings of the IEEE, vol. 77, pp. 257-286 (1989), 10.1109/5.18626