Local minima in training of neural networks

Authors: Razvan Pascanu, Grzegorz Swirszcz, Wojciech Marian Czarnecki

DOI:

Keywords: Value (mathematics); Artificial neural network; Nonlinear system; Initialization; Mathematics; Maxima and minima; Weight space; Training (civil); Mathematical optimization; Error surface

Abstract: There has been a lot of recent interest in trying to characterize the error surface of deep models. This stems from a long-standing question. Given that deep networks are highly …

References (24)
Ilya Sutskever, Geoffrey Hinton, James Martens, George Dahl. On the importance of initialization and momentum in deep learning. International Conference on Machine Learning, pp. 1139–1147 (2013).
Diederik P. Kingma, Jimmy Ba. Adam: A Method for Stochastic Optimization. arXiv (2014).
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification. International Conference on Computer Vision, pp. 1026–1034 (2015). DOI: 10.1109/ICCV.2015.123
Gerard Ben Arous, V. Ugur Guney, Yann LeCun, Levent Sagun. Explorations on high dimensional landscapes. arXiv (2014).
Gérard Ben Arous, Anna Choromanska, Yann LeCun, Mikael Henaff, Michael Mathieu. The Loss Surfaces of Multilayer Networks. International Conference on Artificial Intelligence and Statistics, vol. 38, pp. 192–204 (2015).
Christian Szegedy, Ian J. Goodfellow, Jonathon Shlens. Explaining and Harnessing Adversarial Examples. arXiv (2014).
Jürgen Schmidhuber. Deep learning in neural networks. Neural Networks, vol. 61, pp. 85–117 (2015). DOI: 10.1016/J.NEUNET.2014.09.003
Yan V. Fyodorov, Ian Williams. Replica Symmetry Breaking Condition Exposed by Random Matrix Calculation of Landscape Complexity. Journal of Statistical Physics, vol. 129, pp. 1081–1116 (2007). DOI: 10.1007/S10955-007-9386-X
Andrew M. Saxe, James L. McClelland, Surya Ganguli. Exact solutions to the nonlinear dynamics of learning in deep linear neural networks. arXiv (2013).