Provable Accelerated Gradient Method for Nonconvex Low Rank Optimization.

Authors: Zhouchen Lin, Huan Li

DOI:

Keywords:

Abstract: Optimization with a low rank constraint has broad applications. For large scale problems, an attractive heuristic is to factorize the low rank matrix into a product of two much smaller matrices. In this paper, we study the nonconvex problem $\min_{U\in\mathbf R^{n\times r}} g(U)=f(UU^T)$ under the assumptions that $f(X)$ is restricted $\mu$-strongly convex and $L$-smooth on the set $\{X:X\succeq 0,\ \mbox{rank}(X)\leq r\}$. We propose an accelerated gradient method with the provable convergence rate $g(U^{N+1})-g(U^*)\leq \left(1-O\left(\sqrt{\frac{\mu}{nL}}\right)\right)^N\|U^0-U^*\|_F^2$, where $U^*$ can be any local minimum that the algorithm converges to. The rate improves to $\left(1-O\left(\sqrt{\frac{\mu}{L}}\right)\right)^N$ under the practical assumption that the entries of $U^{k}$ converge nearly equally fast to those of $U^*$. To the best of our knowledge, this is the first work establishing linear convergence with the optimal dependence on the condition number $L/\mu$ for this kind of problems. Our result can also be transferred to the asymmetric factorization $X=\widetilde U\widetilde V^T$.
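To make the factorization heuristic concrete, the sketch below applies a generic Nesterov-style accelerated gradient update directly to the factor $U$ for the illustrative choice $f(X)=\frac{1}{2}\|X-M\|_F^2$ with a PSD rank-$r$ matrix $M$. This is only a minimal illustration of operating on $g(U)=f(UU^T)$; the paper's actual method, its step-size and momentum choices, and its convergence guarantees are not reproduced here, and the function and parameter names (`accelerated_factored_gd`, `eta`, `beta`) are placeholders.

```python
import numpy as np

# Minimal sketch (NOT the paper's exact algorithm): Nesterov-style accelerated
# gradient applied to the factor U for g(U) = f(U U^T), with the illustrative
# choice f(X) = 0.5 * ||X - M||_F^2 for a PSD rank-r matrix M.
# Step size eta and momentum beta below are placeholder values.

def grad_g(U, M):
    """Gradient of g(U) = 0.5 * ||U U^T - M||_F^2, i.e. 2 (U U^T - M) U."""
    return 2.0 * (U @ U.T - M) @ U

def accelerated_factored_gd(M, r, eta=2e-3, beta=0.9, iters=1000, seed=0):
    rng = np.random.default_rng(seed)
    n = M.shape[0]
    U = rng.standard_normal((n, r)) / np.sqrt(n)   # small random initialization
    U_prev = U.copy()
    for _ in range(iters):
        V = U + beta * (U - U_prev)                # momentum (extrapolation) step
        U_prev = U
        U = V - eta * grad_g(V, M)                 # gradient step at the extrapolated point
    return U

if __name__ == "__main__":
    n, r = 50, 3
    A = np.random.default_rng(1).standard_normal((n, r))
    M = A @ A.T                                    # ground-truth PSD matrix of rank r
    U = accelerated_factored_gd(M, r)
    print("relative error:", np.linalg.norm(U @ U.T - M) / np.linalg.norm(M))
```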
