Authors: Francis Bach, Eric Moulines
DOI:
Keywords: Mathematical optimization, Mathematics, Convex function, Stochastic gradient descent, Rate of convergence, Function (mathematics), Supervised learning, Convexity, Quadratic equation, Stochastic approximation
Abstract: We consider the stochastic approximation problem where a convex function has to be minimized, given only the knowledge of unbiased estimates of its gradients at certain points, a framework which includes machine learning methods based on the minimization of the empirical risk. We focus on problems without strong convexity, for which all previously known algorithms achieve a convergence rate for function values of O(1/√n) after n iterations. We consider and analyze two algorithms that achieve a rate of O(1/n) for classical supervised learning problems. For least-squares regression, we show that averaged stochastic gradient descent with constant step-size achieves the desired rate. For logistic regression, this is achieved by a simple novel stochastic gradient algorithm that (a) constructs successive local quadratic approximations of the loss functions, while (b) preserving the same running-time complexity as stochastic gradient descent. For these algorithms, we provide a non-asymptotic analysis of the generalization error (in expectation, and also in high probability for least-squares), and run extensive experiments showing that they often outperform existing approaches.
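To make the least-squares result concrete, the sketch below illustrates averaged stochastic gradient descent with a constant step-size, as described in the abstract. It is a minimal Python/NumPy sketch under my own assumptions: a single pass over the data, one sample per iteration, Polyak-Ruppert averaging of the iterates, and a heuristic step-size proportional to 1/R² estimated from the data. The function name `averaged_sgd_least_squares` and all parameter choices are illustrative, not taken from the paper.

```python
import numpy as np

def averaged_sgd_least_squares(X, y, step_size=None, w0=None):
    """Constant step-size averaged SGD for least-squares regression (illustrative sketch)."""
    n, d = X.shape
    # Heuristic step-size on the order of 1 / R^2, with R^2 estimated as the
    # mean squared norm of the inputs (an assumption for this sketch).
    if step_size is None:
        R2 = np.mean(np.sum(X ** 2, axis=1))
        step_size = 1.0 / (4.0 * R2)
    w = np.zeros(d) if w0 is None else w0.copy()
    w_avg = np.zeros(d)
    for k in range(n):
        x_k, y_k = X[k], y[k]
        # Unbiased stochastic gradient of the squared loss at the current iterate.
        grad = (x_k @ w - y_k) * x_k
        w = w - step_size * grad
        # Running average of the iterates (Polyak-Ruppert averaging).
        w_avg += (w - w_avg) / (k + 1)
    return w_avg

# Usage on synthetic data
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 5))
w_true = rng.standard_normal(5)
y = X @ w_true + 0.1 * rng.standard_normal(1000)
w_hat = averaged_sgd_least_squares(X, y)
print(np.linalg.norm(w_hat - w_true))
```

The key design point illustrated here is that the step-size stays constant rather than decaying, and the averaged iterate (not the last one) is returned as the estimator.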