Feed-Forward Networks with Attention Can Solve Some Long-Term Memory Problems

Authors: Colin Raffel, Daniel P. W. Ellis

Abstract: We propose a simplified model of attention that is applicable to feed-forward neural networks and demonstrate that the resulting model can solve the synthetic "addition" and "multiplication" long-term memory problems for sequence lengths that are both longer and more widely varying than the best published results for these tasks.
