Learned-norm pooling for deep feedforward and recurrent neural networks

Authors: Caglar Gulcehre, Kyunghyun Cho, Razvan Pascanu, Yoshua Bengio

DOI: 10.1007/978-3-662-44848-9_34

Keywords: Norm (mathematics), Activation function, Convolutional neural network, Recurrent neural network, Multilayer perceptron, Perceptron, Deep learning, Pooling, Computer science, Artificial intelligence, Algorithm

Abstract: In this paper we propose and investigate a novel nonlinear unit, called the Lp unit, for deep neural networks. The proposed unit receives signals from several projections of a subset of units in the layer below and computes a normalized Lp norm. We notice two interesting interpretations of the Lp unit. First, the proposed unit can be understood as a generalization of a number of conventional pooling operators such as average, root-mean-square and max pooling, which are widely used in, for instance, convolutional neural networks (CNN), HMAX models and neocognitrons. Furthermore, the Lp unit is, to a certain degree, similar to the recently proposed maxout unit [13], which achieved state-of-the-art object recognition results on a number of benchmark datasets. Secondly, we provide a geometrical interpretation of the activation function, based on which we argue that the Lp unit is more efficient at representing complex, non-linear separating boundaries. Each Lp unit defines a superelliptic boundary, with its exact shape defined by the order p. We claim this makes it possible to model arbitrarily shaped, curved boundaries efficiently by combining a few Lp units of different orders. This insight justifies the need for learning a different order for each unit in the model. We empirically evaluate the proposed units on a number of datasets and show that multilayer perceptrons (MLP) consisting of Lp units achieve competitive results, and we further evaluate the unit in deep recurrent neural networks (RNN).
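To make the unit concrete, below is a minimal NumPy sketch of a single Lp pooling unit as described in the abstract. The names (lp_unit, W, b, rho) and the softplus reparameterization used to keep the learned order p >= 1 are illustrative assumptions, not necessarily the paper's exact formulation:

    import numpy as np

    def lp_unit(x, W, b, rho):
        # One Lp unit: a normalized Lp norm over N linear projections
        # of a subset of units in the layer below (illustrative sketch).
        #   x:   (d,)   input vector from the layer below
        #   W:   (N, d) projection weights feeding this unit
        #   b:   (N,)   projection biases
        #   rho: unconstrained scalar; p = 1 + softplus(rho) keeps p >= 1
        #        (this reparameterization is an assumption of the sketch)
        p = 1.0 + np.log1p(np.exp(rho))               # learned order p
        z = W @ x + b                                 # projections
        return np.mean(np.abs(z) ** p) ** (1.0 / p)   # normalized Lp norm

With p = 1 this reduces to average pooling of absolute values, p = 2 gives root-mean-square pooling, and p -> infinity approaches max pooling, which is the sense in which the unit generalizes the conventional pooling operators mentioned above.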

References (37)
Matthew D. Zeiler. ADADELTA: An Adaptive Learning Rate Method. arXiv preprint (2012).
Yoshua Bengio, Aaron Courville. Deep Learning of Representations. International Conference on Neural Information Processing, pp. 1-28 (2013). DOI: 10.1007/978-3-642-36657-4_1
David E. Rumelhart, Geoffrey E. Hinton, Ronald J. Williams. Learning representations by back-propagating errors. Nature, vol. 323, pp. 533-536 (1986). DOI: 10.1038/323533A0
Frédéric Bastien, Pascal Lamblin, Razvan Pascanu, James Bergstra, Ian Goodfellow, Arnaud Bergeron, Nicolas Bouchard, David Warde-Farley, Yoshua Bengio. Theano: new features and speed improvements. arXiv preprint (2012).
Vinod Nair, Geoffrey E. Hinton. Rectified Linear Units Improve Restricted Boltzmann Machines. International Conference on Machine Learning, pp. 807-814 (2010).
Çağlar Gülçehre, Yoshua Bengio. Knowledge Matters: Importance of Prior Information for Optimization. arXiv preprint (2013).
Razvan Pascanu, Tomas Mikolov, Yoshua Bengio. On the difficulty of training recurrent neural networks. International Conference on Machine Learning, pp. 1310-1318 (2013).
Razvan Pascanu, Yoshua Bengio. Revisiting Natural Gradient for Deep Networks. arXiv preprint (2013).
Ian J. Goodfellow, David Warde-Farley, Pascal Lamblin, Vincent Dumoulin, Mehdi Mirza, Razvan Pascanu, James Bergstra, Frédéric Bastien, Yoshua Bengio. Pylearn2: a machine learning research library. arXiv preprint (2013).