Non-strongly-convex smooth stochastic approximation with convergence rate O(1/n)

作者: Francis Bach , Eric Moulines

DOI:

关键词: Mathematical optimizationMathematicsConvex functionStochastic gradient descentRate of convergenceFunction (mathematics)Supervised learningConvexityQuadratic equationStochastic approximation

摘要: We consider the stochastic approximation problem where a convex function has to be minimized, given only knowledge of unbiased estimates its gradients at certain points, framework which includes machine learning methods based on minimization empirical risk. focus problems without strong convexity, for all previously known algorithms achieve convergence rate values O(1/√n) after n iterations. and analyze two that O(1/n) classical supervised problems. For least-squares regression, we show averaged gradient descent with constant step-size achieves desired rate. logistic this is achieved by simple novel algorithm (a) constructs successive local quadratic approximations loss functions, while (b) preserving same running-time complexity as descent. these algorithms, provide non-asymptotic analysis generalization error (in expectation, also in high probability least-squares), run extensive experiments showing they often outperform existing approaches.

参考文章(34)
Angelia Nedić, Dimitri Bertsekas, Convergence Rate of Incremental Subgradient Algorithms Springer, Boston, MA. pp. 223- 264 ,(2001) , 10.1007/978-1-4757-6594-6_11
Alexandre B. Tsybakov, Optimal Rates of Aggregation Learning Theory and Kernel Machines. pp. 303- 313 ,(2003) , 10.1007/978-3-540-45167-9_23
Yurii Nesterov, Arkadii Nemirovskii, Interior-Point Polynomial Algorithms in Convex Programming ,(1987)
Nicolas Le Roux, Francis Bach, Mark Schmidt, Minimizing Finite Sums with the Stochastic Average Gradient arXiv: Optimization and Control. ,(2013)
A. Nemirovski, A. Juditsky, G. Lan, A. Shapiro, Robust Stochastic Approximation Approach to Stochastic Programming SIAM Journal on Optimization. ,vol. 19, pp. 1574- 1609 ,(2009) , 10.1137/070704277
Herbert Robbins, Sutton Monro, A Stochastic Approximation Method Annals of Mathematical Statistics. ,vol. 22, pp. 400- 407 ,(1951) , 10.1214/AOMS/1177729586