Parallel Asynchronous Stochastic Variance Reduction for Nonconvex Optimization.

Authors: Zhouchen Lin, Cong Fang

DOI:

Keywords:

Abstract:

References (18)
Ruiliang Zhang, James Kwok, Asynchronous Distributed ADMM for Consensus Optimization. International Conference on Machine Learning (ICML), pp. 1701–1709 (2014)
Sashank J. Reddi, Ahmed Hefny, Suvrit Sra, Barnabás Póczos, Alex Smola, On Variance Reduction in Stochastic Gradient Descent and its Asynchronous Variants. arXiv preprint (2015)
Fabio Petroni, Leonardo Querzoni, GASGD: Stochastic Gradient Descent for Distributed Asynchronous Matrix Completion via Graph Partitioning. Conference on Recommender Systems, pp. 241–248 (2014), 10.1145/2645710.2645725
Lin Xiao, Tong Zhang, A Proximal Stochastic Gradient Method with Progressive Variance Reduction. SIAM Journal on Optimization, vol. 24, pp. 2057–2075 (2014), 10.1137/140961791
Rie Johnson, Tong Zhang, Accelerating Stochastic Gradient Descent using Predictive Variance Reduction. Neural Information Processing Systems (NIPS), vol. 26, pp. 315–323 (2013)
Thomas Paine, Hailin Jin, Jianchao Yang, Zhe Lin, Thomas Huang, GPU Asynchronous Stochastic Gradient Descent to Speed Up Neural Network Training. arXiv preprint (2013)
Hyokun Yun, Hsiang-Fu Yu, Cho-Jui Hsieh, S. V. N. Vishwanathan, Inderjit Dhillon, NOMAD. Proceedings of the VLDB Endowment, vol. 7, pp. 975–986 (2014), 10.14778/2732967.2732973
Jeffrey Dean, Greg Corrado, Rajat Monga, Kai Chen, Matthieu Devin, Mark Mao, Marc'Aurelio Ranzato, Andrew Senior, Paul Tucker, Ke Yang, Quoc Le, Andrew Ng, Large Scale Distributed Deep Networks. Neural Information Processing Systems (NIPS), vol. 25, pp. 1223–1231 (2012)
Zeyuan Allen-Zhu, Elad Hazan, Variance Reduction for Faster Non-Convex Optimization. arXiv preprint (2016)
Sashank J. Reddi, Ahmed Hefny, Suvrit Sra, Barnabás Póczos, Alex Smola, Stochastic Variance Reduction for Nonconvex Optimization. arXiv preprint (2016)