Hierarchical Recurrent Neural Networks for Long-Term Dependencies

Authors: Yoshua Bengio, Salah El Hihi

Abstract: … conclusion from the analyses performed on recurrent networks and HMMs to learn to represent … a hierarchical recurrent network with a single-scale fully-connected recurrent network. …

References (22)
Satinder P. Singh. Reinforcement learning with a hierarchy of abstract models. National Conference on Artificial Intelligence, pp. 202-207 (1992).
Bill G. Horne, C. Lee Giles, Tsungnan Lin, Peter Tiňo. Learning long-term dependencies is not as difficult with NARX recurrent neural networks. University of Maryland at College Park (1995).
Richard S. Sutton. TD Models: Modeling the World at a Mixture of Time Scales. Machine Learning Proceedings 1995, pp. 531-539 (1995). doi:10.1016/B978-1-55860-377-6.50072-4
Y. Bengio, P. Frasconi. Diffusion of context and credit information in Markovian models. Journal of Artificial Intelligence Research, vol. 3, pp. 249-270 (1995). doi:10.1613/JAIR.233
I. Daubechies. The wavelet transform, time-frequency localization and signal analysis. IEEE Transactions on Information Theory, vol. 36, pp. 961-1005 (1990). doi:10.1109/18.57199
F. Brugnara, R. De Mori, D. Giuliani, M. Omologo. A family of parallel hidden Markov models. International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 377-380 (1992). doi:10.1109/ICASSP.1992.225893
Jürgen Schmidhuber. Learning complex, extended sequences using the principle of history compression. Neural Computation, vol. 4, pp. 234-242 (1992). doi:10.1162/NECO.1992.4.2.234
Kevin J. Lang, Alex H. Waibel, Geoffrey E. Hinton. A time-delay neural network architecture for isolated word recognition. Neural Networks, vol. 3, pp. 23-43 (1990). doi:10.1016/0893-6080(90)90044-L
P. Frasconi, M. Gori, M. Maggini, G. Soda. Unified integration of explicit knowledge and learning by example in recurrent networks. IEEE Transactions on Knowledge and Data Engineering, vol. 7, pp. 340-346 (1995). doi:10.1109/69.382304