Finito: A faster, permutable incremental gradient method for big data problems

Authors: Tiberio Caetano, Justin Domke, Aaron Defazio

DOI:

Keywords:

Abstract: Recent advances in optimization theory have shown that smooth strongly convex finite sums can be minimized faster than by treating them as a black-box "batch" problem. In this work we introduce a new method class whose theoretical convergence rate is four times faster than that of existing methods, given sufficiently many terms. The method is also amenable to a sampling-without-replacement scheme that in practice gives further speed-ups. We give empirical results showing state-of-the-art performance.
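
The abstract only sketches the method class. As a rough illustration, below is a minimal Python sketch of a Finito-style incremental gradient loop applied to ridge-regularized least squares: it keeps a table of points phi_i and their gradients, takes steps of the form w = mean(phi_i) - (1/(alpha*s*n)) * sum of stored gradients, and visits indices in a fresh random permutation each pass (sampling without replacement). The choice of loss, the regularizer lam, the constant alpha = 2, and all other parameter values are assumptions for illustration, not the paper's exact algorithm or constants.

```python
import numpy as np

def finito(X, y, lam=1.0, alpha=2.0, epochs=20, seed=0):
    """Sketch of a Finito-style incremental gradient method for ridge regression.

    Each term is f_i(w) = 0.5*(x_i.w - y_i)^2 + (lam/2)*||w||^2, so the
    strong-convexity constant of each f_i is s = lam.  The constant alpha = 2
    is the value usually quoted under a "big data" condition (roughly
    n >= 2L/s); both are assumptions here, not guarantees for arbitrary data.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    s = lam                                    # strong-convexity constant of each term
    phi = np.zeros((n, d))                     # table of stored points phi_i
    grads = np.zeros((n, d))                   # table of stored gradients grad f_i(phi_i)
    for i in range(n):
        grads[i] = (X[i] @ phi[i] - y[i]) * X[i] + lam * phi[i]
    phi_mean = phi.mean(axis=0)
    grad_mean = grads.mean(axis=0)

    for _ in range(epochs):
        # Sampling without replacement: one random permutation per pass.
        for j in rng.permutation(n):
            # Finito-style step: w = mean(phi) - (1/(alpha*s)) * mean(grads).
            w = phi_mean - grad_mean / (alpha * s)
            # Refresh the tables for index j and maintain the running means.
            new_grad = (X[j] @ w - y[j]) * X[j] + lam * w
            phi_mean += (w - phi[j]) / n
            grad_mean += (new_grad - grads[j]) / n
            phi[j] = w
            grads[j] = new_grad
    return phi_mean - grad_mean / (alpha * s)

# Hypothetical usage on synthetic data:
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 10))
y = X @ rng.standard_normal(10) + 0.01 * rng.standard_normal(500)
w_hat = finito(X, y, lam=1.0)
```

The per-index running-mean updates keep each inner iteration at O(d) cost beyond the single gradient evaluation, which is what makes this kind of table-based incremental method practical; the storage cost is the O(nd) gradient table.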

References (10)
Nicolas Le Roux, Francis Bach, Mark Schmidt. Minimizing Finite Sums with the Stochastic Average Gradient. arXiv: Optimization and Control (2013).
Shai Shalev-Shwartz, Tong Zhang. Stochastic Dual Coordinate Ascent Methods for Regularized Loss Minimization. Journal of Machine Learning Research, vol. 14, pp. 567-599 (2013).
Julien Mairal. Optimization with First-Order Surrogate Functions. International Conference on Machine Learning, vol. 28, pp. 783-791 (2013).
Julien Mairal. Incremental Majorization-Minimization Optimization with Application to Large-Scale Machine Learning. SIAM Journal on Optimization, vol. 25, pp. 829-855 (2015). doi:10.1137/140957639.
Martin Takáč, Peter Richtárik. Iteration Complexity of Randomized Block-Coordinate Descent Methods for Minimizing a Composite Function. arXiv: Optimization and Control (2011).
Shai Shalev-Shwartz, Tong Zhang. Stochastic Dual Coordinate Ascent Methods for Regularized Loss Minimization. arXiv: Machine Learning (2012).
Yurii Nesterov. Efficiency of Coordinate Descent Methods on Huge-Scale Optimization Problems. Research Papers in Economics (2010).