On the Comparison between Cyclic Sampling and Random Reshuffling

Authors: Kun Yuan, Wotao Yin, Xianghui Mao, Xinmeng Huang


Abstract: When applying a stochastic/incremental algorithm, one must choose the order in which to draw samples. Among the most popular approaches are cyclic sampling and random reshuffling, which are empirically faster and more cache-friendly than uniform i.i.d. sampling. Cyclic sampling draws the samples in a fixed order, which is less robust than reshuffling the samples periodically. Indeed, existing works have established worst-case convergence rates for cyclic sampling that are generally worse than those of random reshuffling. In this paper, however, we find that a certain cyclic order can be much faster than reshuffling, and that one can discover it at low cost. Studying and comparing different sampling orders typically requires new analytic techniques. We introduce a norm, defined based on the sampling order, to measure the distance to the solution. Applying this technique to the proximal Finito/MISO algorithm allows us to identify an optimal fixed ordering, which can beat random reshuffling by a factor of up to log(n)/n in terms of the best-known upper bounds. We also propose a strategy to discover the optimal fixed ordering numerically. The established rates are state-of-the-art compared to previous works.
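To make the contrast concrete, the sketch below generates the per-epoch sample orders produced by the two schemes discussed in the abstract. This is an illustrative toy, not the paper's method: the function names `cyclic_order` and `random_reshuffling_order` are our own, and an actual incremental algorithm (e.g., proximal Finito/MISO) would consume these indices one gradient update at a time.

```python
import random

def cyclic_order(n, num_epochs):
    """Cyclic sampling: the same fixed order 0, 1, ..., n-1 every epoch."""
    order = list(range(n))
    return [order[:] for _ in range(num_epochs)]

def random_reshuffling_order(n, num_epochs, seed=0):
    """Random reshuffling: a fresh uniform permutation at each epoch."""
    rng = random.Random(seed)
    epochs = []
    for _ in range(num_epochs):
        order = list(range(n))
        rng.shuffle(order)  # in-place Fisher-Yates shuffle
        epochs.append(order)
    return epochs

# Example with n = 5 samples over 3 epochs: cyclic sampling repeats the
# identical order, while random reshuffling permutes anew each epoch.
print(cyclic_order(5, 3))
print(random_reshuffling_order(5, 3))
```

Both schemes touch every sample exactly once per epoch; they differ only in whether the order is fixed across epochs, which is exactly the degree of freedom the paper exploits to search for a good fixed ordering.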
