Understanding speech recognition using correlation-generated neural network targets

Author: Yonghong Yan

DOI: 10.1109/89.759046

Abstract: Training neural networks with variable targets for speech recognition systems has been shown to be effective in improving word accuracy. In this correspondence, a new and simple method for estimating variable targets for a given training pattern is presented. It uses estimated correlations between different output nodes of the network to create a set of targets for each pattern. Experimental results show that the error is reduced by more than 20% when these correlation-based targets are compared with conventional zero/one targets with a squared-error cost function. Performance approaches that of high-performance hidden Markov model (HMM) recognizers but requires far fewer parameters.

References (16)
Yeshwant K. Muthusamy, Ronald A. Cole, Beatrice T. Oshika. The OGI multi-language telephone speech corpus. Conference of the International Speech Communication Association (1992).
Mike Noel, Terri Lander, Ronald A. Cole, T. Durham. New telephone speech corpora at CSLU. Conference of the International Speech Communication Association (1995).
A.-M. Derouault. Context-dependent phonetic Markov models for large vocabulary speech recognition. International Conference on Acoustics, Speech, and Signal Processing, vol. 12, pp. 360-363 (1987). DOI: 10.1109/ICASSP.1987.1169604
J.F. Pitrelli, C. Fong, S.H. Wong, J.R. Spitz, H.C. Leung. PhoneBook: a phonetically-rich isolated-word telephone-speech database. International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 101-104 (1995). DOI: 10.1109/ICASSP.1995.479283
Etienne Barnard, Ronald Cole, Mark Fanty, Pieter J. E. Vermeulen. Real-world speech recognition with neural networks. Applications and Science of Artificial Neural Networks, vol. 2492, pp. 524-537 (1995). DOI: 10.1117/12.205157
Michael D. Richard, Richard P. Lippmann. Neural Network Classifiers Estimate Bayesian a posteriori Probabilities. Neural Computation, vol. 3, pp. 461-483 (1991). DOI: 10.1162/NECO.1991.3.4.461
Anthony J. Robinson, Andrew W. Senior. Forward-backward retraining of recurrent neural networks. Neural Information Processing Systems, vol. 8, pp. 743-749 (1995).
Yonghong Yan, M. Fanty, R. Cole. Speech recognition using neural networks with forward-backward probability generated targets. International Conference on Acoustics, Speech, and Signal Processing, vol. 4, pp. 3241-3244 (1997). DOI: 10.1109/ICASSP.1997.595483
H. Gish. A probabilistic approach to the understanding and training of neural network classifiers. International Conference on Acoustics, Speech, and Signal Processing, pp. 1361-1364 (1990). DOI: 10.1109/ICASSP.1990.115636
Andrew R. Barron. Statistical properties of artificial neural networks. Conference on Decision and Control, pp. 280-285 (1989). DOI: 10.1109/CDC.1989.70117