Convex Principal Feature Selection.

Authors: Mahdokht Masaeli, Glenn Fung, Jennifer G. Dy, Yan Yan, Ying Cui

DOI:

Keywords: Pattern recognition, Principal (computer security), Feature selection, Dimensionality reduction, Artificial intelligence, Feature (computer vision), Redundancy (engineering), Transformation (function), Principal component analysis, Sparse PCA, Computer science

Abstract: A popular approach for dimensionality reduction and data analysis is principal component analysis (PCA). A limiting factor with PCA is that it does not inform us on which of the original features are important. There is a recent interest in sparse PCA (SPCA). By applying an L1 regularizer to PCA, a sparse transformation is achieved. However, true feature selection may not be achieved, as non-sparse coefficients may be distributed over several features. Feature selection is an NP-hard combinatorial optimization problem. This paper relaxes and re-formulates the feature selection problem as a convex continuous optimization problem that minimizes a mean-squared-reconstruction error (a criterion optimized by PCA) and takes feature redundancy into account (an important property in PCA and feature selection). We call this new method Convex Principal Feature Selection (CPFS). Experiments show that CPFS performed better than SPCA in selecting features that maximize variance or minimize the mean-squared-reconstruction error.
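The general idea in the abstract, replacing combinatorial feature selection with a convex problem built around reconstruction error, can be illustrated with a small sketch. This is not the CPFS formulation from the paper, which the abstract does not spell out. The sketch assumes a row-wise L2 (group-lasso) penalty on a reconstruction matrix `A` and a plain proximal-gradient solver, both chosen here purely for illustration. The function name `select_features_row_sparse` and the parameters `lam` and `n_iter` are hypothetical.

```python
import numpy as np

def select_features_row_sparse(X, lam, n_iter=500):
    """Illustrative only: minimize 0.5*||X - X A||_F^2 + lam * sum_i ||A[i, :]||_2.

    Row i of A holds the weights with which feature i is used to reconstruct
    all features, so a row that is exactly zero marks a feature that is not
    needed. Assumes X has at least one feature with nonzero variance.
    """
    X = X - X.mean(axis=0)               # center the data, as PCA does
    d = X.shape[1]
    G = X.T @ X                          # d x d Gram matrix
    step = 1.0 / np.linalg.norm(G, 2)    # 1 / Lipschitz constant of the gradient
    A = np.zeros((d, d))
    for _ in range(n_iter):
        grad = G @ A - G                 # gradient of the squared-error term
        B = A - step * grad              # plain gradient step
        # Proximal step for the row-wise L2 penalty: shrink each row toward
        # zero and set it exactly to zero if its norm is below step * lam.
        norms = np.linalg.norm(B, axis=1, keepdims=True)
        shrink = np.maximum(0.0, 1.0 - step * lam / np.maximum(norms, 1e-12))
        A = shrink * B
    row_norms = np.linalg.norm(A, axis=1)
    return A, row_norms
```

Under these assumptions, features whose row norm in `A` is zero are the ones dropped. Larger values of `lam` zero out more rows, so it acts as the knob controlling how many features are kept.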
