Finding low-rank solutions to smooth convex problems via the Burer-Monteiro approach

Authors: Dohyung Park, Anastasios Kyrillidis, Constantine Caramanis, Sujay Sanghavi

DOI: 10.1109/ALLERTON.2016.7852264

Keywords: Gradient descent; Rank (linear algebra); Discrete mathematics; Convex function; Lipschitz continuity; Function (mathematics); Matrix (mathematics); Mathematics; Combinatorics; Matrix completion; Sublinear function

Abstract: A rank-r matrix X ∈ ℝ^{m×n} can be written as a product UV⊤, where U ∈ ℝ^{m×r} and V ∈ ℝ^{n×r}. One could exploit this observation in optimization: e.g., consider the minimization of a convex function f(X) over matrices, where the set of feasible matrices is modeled via such a factorization. This heuristic has been widely used before for problem instances where the solution sought is (approximately) low-rank. Though such a parameterization reduces the number of variables and is more efficient with respect to computational and memory requirements (of particular interest is the case r ≪ min{m, n}), it comes at a cost: f(UV⊤) becomes non-convex in U and V. In this paper, we study optimizing a generic smooth f, i.e., one with Lipschitz continuous gradients, and focus on first-order, gradient descent algorithmic solutions. We propose the Bi-Factored Gradient Descent (BFGD) algorithm, a first-order method that operates directly on the U, V factors. We show that when f is smooth and BFGD is initialized properly, it exhibits local sublinear convergence to a globally optimum point. As a test case, we consider the 1-bit matrix completion problem: we compare BFGD with state-of-the-art approaches and observe at least competitive error performance in real-dataset experiments, while BFGD is faster in execution compared to the rest of the algorithms. We conclude our work with some remarks and open questions for further investigation.
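The core idea of the abstract can be illustrated in a few lines: instead of a gradient step on X, take gradient steps directly on the factors U and V, using the chain rule (if G = ∇f(X) at X = UV⊤, then ∇_U = GV and ∇_V = G⊤U). Below is a minimal NumPy sketch under a toy objective f(X) = ½‖X − M‖²_F; it is an illustrative assumption, not the paper's implementation — BFGD as proposed uses a specific step-size rule and initialization that this sketch replaces with fixed, hand-tuned values.

```python
import numpy as np

def bfgd(grad_f, U0, V0, step, iters):
    """Sketch of bi-factored gradient descent: minimize f(U @ V.T)
    by simultaneous gradient steps on the two factors."""
    U, V = U0.copy(), V0.copy()
    for _ in range(iters):
        G = grad_f(U @ V.T)  # gradient of f at the current X = U V^T
        # Tuple assignment evaluates both right-hand sides first,
        # so U and V are updated simultaneously from the same G.
        U, V = U - step * (G @ V), V - step * (G.T @ U)
    return U, V

# Toy instance: recover a rank-2 matrix M with f(X) = 0.5*||X - M||_F^2,
# whose gradient is simply X - M.
rng = np.random.default_rng(0)
m, n, r = 20, 15, 2
M = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
U0 = 0.1 * rng.standard_normal((m, r))  # small random init (a stand-in
V0 = 0.1 * rng.standard_normal((n, r))  # for the paper's initialization)
U, V = bfgd(lambda X: X - M, U0, V0, step=0.01, iters=5000)
rel_err = np.linalg.norm(U @ V.T - M) / np.linalg.norm(M)
```

Note the memory trade-off mentioned in the abstract: the iterates store m·r + n·r numbers rather than m·n, which is the point of working in the factored space when r ≪ min{m, n}.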
