Stochastic Variance Reduced Gradient Methods Using a Trust-Region-Like Scheme

Authors: Yu-Hong Dai, Xin-Wei Liu, Tengteng Yu, Jie Sun

DOI: 10.1007/s10915-020-01402-x


Abstract: Stochastic variance reduced gradient (SVRG) methods are important approaches to minimize the average of a large number of cost functions, a problem that arises frequently in machine learning and many other applications. In this paper, based on SVRG, we propose the SVRG-TR method, which employs a trust-region-like scheme for selecting stepsizes. It is proved that SVRG-TR is linearly convergent in expectation for smooth strongly convex functions and enjoys a faster convergence rate than SVRG methods. In order to overcome the difficulty of tuning stepsizes by hand, we combine the Barzilai–Borwein (BB) method with SVRG-TR to compute stepsizes automatically, yielding the SVRG-TR-BB method. By incorporating mini-batching with SVRG-TR and SVRG-TR-BB, respectively, we further obtain two extended methods, mSVRG-TR and mSVRG-TR-BB. Linear convergence and complexity results for mSVRG-TR are given. Numerical experiments on some standard datasets show that SVRG-TR is generally better than, or comparable to, the best-tuned modern stochastic methods, while mSVRG-TR-BB is very competitive with the mini-batch variants of recent successful stochastic methods.
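The abstract describes combining SVRG-style variance reduction with automatically computed Barzilai–Borwein stepsizes. The paper's trust-region-like stepsize rule is not specified in the abstract, so the sketch below shows only the underlying ingredients: plain SVRG on a least-squares objective, with the snapshot-based BB stepsize of the kind used in SVRG-BB. Function and variable names here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Illustrative sketch: SVRG with a Barzilai-Borwein (BB) stepsize.
# Problem: min_x (1/n) * sum_i 0.5 * (a_i^T x - b_i)^2  (least squares).
# The trust-region-like stepsize scheme of the paper is NOT reproduced here.

def svrg_bb(A, b, epochs=30, eta0=0.01, seed=0):
    rng = np.random.default_rng(seed)
    n, d = A.shape
    grad_full = lambda x: A.T @ (A @ x - b) / n  # full gradient at snapshot
    x_tilde = np.zeros(d)                        # snapshot point
    eta = eta0                                   # initial stepsize (hand-picked)
    prev_x, prev_g = None, None
    for _ in range(epochs):
        g_tilde = grad_full(x_tilde)
        # BB stepsize from successive snapshots:
        #   eta = ||s||^2 / (m * s^T y),  s = x_k - x_{k-1},  y = g_k - g_{k-1},
        # with m = n inner iterations per epoch.
        if prev_x is not None:
            s, y = x_tilde - prev_x, g_tilde - prev_g
            denom = n * abs(s @ y)
            if denom > 1e-12:
                eta = (s @ s) / denom
        prev_x, prev_g = x_tilde.copy(), g_tilde.copy()
        x = x_tilde.copy()
        for _ in range(n):                       # inner loop: m = n stochastic steps
            i = rng.integers(n)
            gi = A[i] * (A[i] @ x - b[i])        # gradient of the i-th component at x
            gi_tilde = A[i] * (A[i] @ x_tilde - b[i])
            x -= eta * (gi - gi_tilde + g_tilde) # variance-reduced update
        x_tilde = x
    return x_tilde
```

On a small synthetic system `b = A @ x_star`, the iterates converge linearly to `x_star` without any stepsize tuning beyond the initial `eta0`, which is the practical appeal of BB-type rules that the abstract highlights.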
