Modular construction of time-delay neural networks for speech recognition

Author: Alex Waibel

DOI: 10.1162/NECO.1989.1.1.39

Keywords: LTI system theory, Time delay neural network, ENCODE, Speech recognition, Modularity (networks), Artificial intelligence, Connectionism, Problem of time, Network model, Artificial neural network, Computer science

Abstract: Several strategies are described that overcome limitations of basic network models as steps towards the design of large connectionist speech recognition systems. The two major areas of concern are the problem of time and the problem of scaling. Speech signals continuously vary over time and encode and transmit enormous amounts of human knowledge. To decode these signals, neural networks must be able to use appropriate representations of time, and it must be possible to extend these nets to almost arbitrary sizes and complexity within finite resources. The problem of time is addressed by the development of a Time-Delay Neural Network; the problem of scaling by Modularity and Incremental Design of large nets based on smaller subcomponent nets. It is shown that small networks trained to perform limited tasks develop time-invariant, hidden abstractions that can subsequently be exploited to train larger, more complex nets efficiently. Using these techniques, phoneme recognition networks of increasing complexity...
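
As a rough illustration of the time-delay idea named in the abstract, the sketch below applies one bank of units to a sliding window of consecutive feature frames, reusing the same weights at every time step. This is only a minimal sketch under stated assumptions: the function name `time_delay_layer`, the list-based data layout, the sigmoid activation, and the toy weights and frames are all hypothetical choices made here for illustration and are not taken from the paper.

```python
import math


def time_delay_layer(frames, weights, bias):
    """Apply one bank of time-delay units to a sequence of feature frames.

    frames:  list of T frames, each a list of F floats
    weights: list of U units; each unit is a list of D delays,
             and each delay is a list of F floats
    bias:    list of U floats
    Returns a list of T - D + 1 output frames, each with U activations.
    (Layout and names are assumptions for this sketch.)
    """
    n_delays = len(weights[0])
    outputs = []
    for t in range(len(frames) - n_delays + 1):
        window = frames[t:t + n_delays]  # current frame plus delayed frames
        out_frame = []
        for unit_w, b in zip(weights, bias):
            total = b
            # The same weights are reused at every window position t.
            for delay_w, frame in zip(unit_w, window):
                total += sum(w * x for w, x in zip(delay_w, frame))
            out_frame.append(1.0 / (1.0 + math.exp(-total)))  # sigmoid
        outputs.append(out_frame)
    return outputs


# Toy input: 6 frames with 2 features each (made-up values).
frames = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
          [1.0, 1.0], [0.0, 0.0], [0.0, 0.0]]
# The same pattern delayed by one frame.
shifted = [[0.0, 0.0]] + frames[:-1]

# One unit looking at 3 consecutive frames (made-up weights).
W = [[[0.5, -0.5], [1.0, 0.25], [-0.75, 0.5]]]
b = [0.1]

out = time_delay_layer(frames, W, b)
out_shifted = time_delay_layer(shifted, W, b)
```

Because the weights are shared across window positions, delaying the input by one frame delays the unit's response by one frame rather than changing it, which is one way to read the abstract's remark about time-invariant hidden abstractions. In this toy setup, `out_shifted[1:]` matches `out[:-1]`.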
