Author: Takéhiko Nakama
DOI: 10.1016/J.NEUCOM.2009.05.017
Keywords:
Abstract: In this study, we theoretically analyze two essential training schemes for gradient descent learning in neural networks: batch and on-line training. The convergence properties of the two schemes applied to quadratic loss functions are analytically investigated. We quantify the convergence of each scheme to the optimal weight using the absolute value of the expected difference (Measure 1) and the expected squared difference (Measure 2) between the optimal weight and the weight computed by each scheme. Although on-line training has several advantages over batch training with respect to the first measure, it does not converge to the optimal weight with respect to the second measure if the variance of the per-instance gradient remains constant. However, if this variance decays exponentially, then on-line training also converges to the optimal weight with respect to Measure 2. Our analysis reveals the exact degrees to which the training set size, the variance of the per-instance gradient, and the learning rate affect the rate of convergence for each scheme.
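The sketch below is a minimal numerical illustration of the contrast described in the abstract, not the paper's analysis: it compares batch and on-line gradient descent on a one-dimensional quadratic loss and estimates empirical analogues of Measure 1 and Measure 2 over repeated runs. The 1-D setting, the constant learning rate, and all variable names (e.g. `targets`, `eta`, `w_opt`) are assumptions chosen for illustration.

```python
# Minimal sketch (not from the paper): batch vs. on-line gradient descent on a
# 1-D quadratic loss L(w) = mean_i (w - t_i)^2 / 2, whose optimal weight is mean(t).
# "Measure 1" ~ |E[w_final - w*]|, "Measure 2" ~ E[(w_final - w*)^2],
# both estimated over independent runs with shuffled instance order.
import numpy as np

rng = np.random.default_rng(0)
targets = rng.normal(loc=1.0, scale=0.5, size=50)  # training instances
w_opt = targets.mean()                             # minimizer of the quadratic loss
eta, epochs, runs = 0.1, 100, 200                  # constant learning rate (assumed)

def batch_train(t, eta, epochs):
    w = 0.0
    for _ in range(epochs):
        w -= eta * np.mean(w - t)          # full-batch gradient step
    return w

def online_train(t, eta, epochs, rng):
    w = 0.0
    for _ in range(epochs):
        for ti in rng.permutation(t):      # per-instance (on-line) updates
            w -= eta * (w - ti)
    return w

batch_w = np.array([batch_train(targets, eta, epochs) for _ in range(runs)])
online_w = np.array([online_train(targets, eta, epochs, rng) for _ in range(runs)])

for name, w in [("batch", batch_w), ("on-line", online_w)]:
    m1 = abs(np.mean(w - w_opt))           # empirical analogue of Measure 1
    m2 = np.mean((w - w_opt) ** 2)         # empirical analogue of Measure 2
    print(f"{name:8s}  Measure 1 ~ {m1:.2e}  Measure 2 ~ {m2:.2e}")
```

With a constant learning rate, the on-line scheme's squared deviation (Measure 2) stays bounded away from zero because of the per-instance gradient variance, whereas the batch scheme drives both measures toward zero, which is consistent with the qualitative behavior the abstract describes.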