Fused Lasso Screening Rules via the Monotonicity of Subdifferentials

作者: Jie Wang , Wei Fan , Jieping Ye

DOI: 10.1109/TPAMI.2014.2388203

关键词: Affine transformationFeature DimensionMonotonic functionLasso (statistics)Mathematical optimizationSmoothnessFeature extractionMathematicsDual (category theory)Speedup

摘要: Fused Lasso is a popular regression technique that encodes the smoothness of data. It has been applied successfully to many applications with smooth feature structure. However, computational cost existing solvers for fused prohibitive when dimension extremely large. In this paper, we propose novel screening rules are able quickly identity adjacent features same coefficients. As result, number variables be estimated can significantly reduced, leading substantial savings in and memory usage. To best our knowledge, proposed approach first attempt develop methods problem general data matrix. Our major contributions are: 1) derive new dual formulation comes several desirable properties; 2) show equivalent standard by two affine transformations; 3) framework developing effective efficient f used La sso via m onotonicity s ubdifferentials (FLAMS). Some appealing FLAMS safe sense detected guaranteed have coefficients; dataset needs scanned only once run screening, whose negligible compared solving Lasso; (3) independent integrated any solvers. We evaluated on both synthetic real datasets. The experiments indicate very identifying speedup gained orders magnitude.

参考文章(26)
Kohei Ogawa, Yoshiki Suzuki, Ichiro Takeuchi, Safe Screening of Non-Support Vectors in Pathwise SVM Computation international conference on machine learning. pp. 1382- 1390 ,(2013)
Patrick L. Combettes, Heinz H. Bauschke, Convex Analysis and Monotone Operator Theory in Hilbert Spaces ,(2011)
Osman Güler, Foundations of Optimization Graduate Texts in Mathematics. ,(2010) , 10.1007/978-0-387-68407-9
Tarek Rabbani, Laurent El Ghaoui, Vivian Viallon, Safe Feature Elimination in Sparse Supervised Learning arXiv: Learning. ,(2010)
Rebecca A. Betensky, Andreas Von Deimling, J. Gregory Cairncross, Catherine L. Nutt, Todd R. Golub, Todd R. Golub, Ute Pohl, Peter M. Black, David N. Louis, Scott L. Pomeroy, Tracy T. Batchelor, Christine Ladd, Margaret E. McLaughlin, Pablo Tamayo, D. R. Mani, Christian Hartmann, Gene expression-based classification of malignant gliomas correlates better with survival than histological classification. Cancer Research. ,vol. 63, pp. 1602- 1607 ,(2003)
R. Tyrrell Rockafellar, Conjugate Duality and Optimization ,(1987)
Jie Wang, Peter Wonka, Peter Wonka, Jieping Ye, Scaling SVM and Least Absolute Deviations via Exact Data Reduction international conference on machine learning. pp. 523- 531 ,(2014)
A. Ahmed, E. P. Xing, Recovering time-varying networks of dependencies in social and biological studies. Proceedings of the National Academy of Sciences of the United States of America. ,vol. 106, pp. 11878- 11883 ,(2009) , 10.1073/PNAS.0901910106
Laurent Condat, A Direct Algorithm for 1-D Total Variation Denoising IEEE Signal Processing Letters. ,vol. 20, pp. 1054- 1057 ,(2013) , 10.1109/LSP.2013.2278339
Jiayu Zhou, Jie Wang, Peter Wonka, Jieping Ye, Lasso Screening Rules via Dual Polytope Projection neural information processing systems. ,vol. 26, pp. 1070- 1078 ,(2013)