Learning deep hierarchical and temporal recurrent neural networks with residual learning

Authors: Tehseen Zia, Assad Abbas, Usman Habib, Muhammad Sajid Khan

DOI: 10.1007/s13042-020-01063-0

Abstract: Learning both hierarchical and temporal dependencies can be crucial for recurrent neural networks (RNNs) to deeply understand sequences. To this end, a unified RNN framework is required that eases the learning of deep structures by allowing gradients to propagate back from both ends without vanishing. Residual learning (RL) has emerged as an effective and less costly method to facilitate the backward propagation of gradients. So far, however, the significance of RL has been shown exclusively for learning either deep hierarchical representations or temporal dependencies; efforts to unify these findings into a single framework for learning deep RNNs are lacking. In this study, we aim to prove that approximating the identity mapping is crucial for optimizing both hierarchical and temporal structures. We propose a framework, called residual RNNs, that learns RNNs by approximating identity mappings across hierarchical and temporal structures. To validate the proposed method, we explore the efficacy of employing shortcut connections for training deep RNNs on sequence learning problems. Experiments are performed on the Penn Treebank, Hutter Prize, and IAM-OnDB datasets, and the results demonstrate the utility of the method in terms of accuracy and computational complexity. Even for large datasets, exploiting parameters to increase network depth can gain benefits with a reduced size of the "state".
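To make the idea concrete, below is a minimal sketch of a stacked RNN with identity shortcuts along both axes described in the abstract: temporal (the new state adds a learned update to the previous state) and hierarchical (each layer adds its output to its input). This is an illustrative reading of the abstract, not the authors' implementation; the use of PyTorch, the tanh cell, and the names `ResidualRNNCell` and `HierarchicalTemporalResidualRNN` are all assumptions.

```python
# Sketch only: residual connections along both the temporal axis
# (h_t = h_{t-1} + f(x_t, h_{t-1})) and the hierarchical axis
# (layer output = layer input + layer state). Cell structure and
# sizes are illustrative assumptions, not the paper's exact model.
import torch
import torch.nn as nn


class ResidualRNNCell(nn.Module):
    """One recurrent cell with a temporal residual connection."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.in_proj = nn.Linear(hidden_size, hidden_size)
        self.rec_proj = nn.Linear(hidden_size, hidden_size, bias=False)

    def forward(self, x: torch.Tensor, h_prev: torch.Tensor) -> torch.Tensor:
        # Temporal residual: gradients can flow back through time via
        # the identity path h_prev, bypassing the nonlinearity.
        return h_prev + torch.tanh(self.in_proj(x) + self.rec_proj(h_prev))


class HierarchicalTemporalResidualRNN(nn.Module):
    """Stack of residual cells with depth-wise (hierarchical) shortcuts."""

    def __init__(self, hidden_size: int, num_layers: int):
        super().__init__()
        self.cells = nn.ModuleList(
            ResidualRNNCell(hidden_size) for _ in range(num_layers)
        )

    def forward(self, seq: torch.Tensor) -> torch.Tensor:
        # seq: (time, batch, hidden_size)
        T, B, H = seq.shape
        states = [seq.new_zeros(B, H) for _ in self.cells]
        outputs = []
        for t in range(T):
            x = seq[t]
            for i, cell in enumerate(self.cells):
                states[i] = cell(x, states[i])
                # Hierarchical residual: each layer's input is added to
                # its state before feeding the next layer.
                x = x + states[i]
            outputs.append(x)
        return torch.stack(outputs)


if __name__ == "__main__":
    model = HierarchicalTemporalResidualRNN(hidden_size=32, num_layers=4)
    y = model(torch.randn(10, 8, 32))  # (time=10, batch=8, features=32)
    print(y.shape)  # torch.Size([10, 8, 32])
```

With identity paths in both directions, the gradient reaching an early state or a lower layer contains an additive term that bypasses the repeated nonlinear transformations, which is the property the abstract credits for easing the optimization of deep hierarchical and temporal structures.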
