Towards speaker-independent continuous speech recognition

Author: Kai-Fu Lee

DOI: 10.1007/978-3-642-83476-9_36

Abstract: Speaker-independent continuous speech recognition is an extremely difficult task. In this paper, we analyze the nature of its difficulty. Moreover, we propose a new approach to speaker-independent continuous speech recognition through the use of hidden Markov models, context-dependent phonetic units, perceptually motivated parameters, and two speaker-adaptation algorithms. Finally, we present some preliminary results and outline future plans.

References (10)
R. Schwartz, Yen-Lu Chow, F. Kubala. Rapid speaker adaptation using a probabilistic spectral mapping. International Conference on Acoustics, Speech, and Signal Processing, vol. 12, pp. 633-636 (1987). DOI: 10.1109/ICASSP.1987.1169575
R. Schwartz, Y. Chow, O. Kimball, S. Roucos, M. Krasner, J. Makhoul. Context-dependent modeling for acoustic-phonetic recognition of continuous speech. International Conference on Acoustics, Speech, and Signal Processing, vol. 10, pp. 1205-1208 (1985). DOI: 10.1109/ICASSP.1985.1168283
V. Gupta, M. Lennig, P. Mermelstein. Integration of acoustic information in a large vocabulary word recognizer. International Conference on Acoustics, Speech, and Signal Processing, vol. 12, pp. 697-700 (1987). DOI: 10.1109/ICASSP.1987.1169578
Lalit R. Bahl, Frederick Jelinek, Robert L. Mercer. A Maximum Likelihood Approach to Continuous Speech Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-5, pp. 179-190 (1983). DOI: 10.1109/TPAMI.1983.4767370
Frederick Jelinek. Continuous speech recognition by statistical methods. Proceedings of the IEEE, vol. 64, pp. 532-556 (1976). DOI: 10.1109/PROC.1976.10159
S. Furui. Speaker-independent isolated word recognition using dynamic features of speech spectrum. IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 34, pp. 52-59 (1986). DOI: 10.1109/TASSP.1986.1164788
K. Shikano, Kai-Fu Lee, R. Reddy. Speaker adaptation through vector quantization. International Conference on Acoustics, Speech, and Signal Processing, vol. 11, pp. 2643-2646 (1986). DOI: 10.1109/ICASSP.1986.1168676
Y. Chow, M. Dunham, Owen Kimball, M. Krasner, G. Kubala, John Makhoul, P. Price, S. Roucos, R. Schwartz. BYBLOS: The BBN continuous speech recognition system. International Conference on Acoustics, Speech, and Signal Processing, vol. 12, pp. 596-599 (1987). DOI: 10.1109/ICASSP.1987.1169748
J. Baker. The DRAGON system--An overview. IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 23, pp. 24-29 (1975). DOI: 10.1109/TASSP.1975.1162650