Stable feature selection via dense feature groups

Authors: Lei Yu, Chris Ding, Steven Loscalzo

DOI: 10.1145/1401890.1401986

Keywords: Selection (genetic algorithm); Stability (learning theory); Minimum redundancy feature selection; Feature (computer vision); k-nearest neighbors algorithm; Dimensionality reduction; Artificial intelligence; Clustering high-dimensional data; Data mining; Pattern recognition; Mathematics; Feature selection

Abstract: Many feature selection algorithms have been proposed in the past focusing on improving classification accuracy. In this work, we point out the importance of stable feature selection for knowledge discovery from high-dimensional data, and identify two causes of instability of feature selection algorithms: selection of a minimum subset without redundant features, and small sample size. We propose a general framework for stable feature selection which emphasizes both good generalization and stability of feature selection results. The framework identifies dense feature groups based on kernel density estimation and treats the features in each dense group as a coherent entity for feature selection. An efficient algorithm, DRAGS (Dense Relevant Attribute Group Selector), is developed under this framework. We also introduce a measure for assessing the stability of feature selection algorithms. Our empirical study based on microarray data verifies that dense feature groups remain stable under random sample hold out, and that the DRAGS algorithm is effective in identifying a set of feature groups which exhibit both high classification accuracy and stability.
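
The idea of dense feature groups found by kernel density estimation can be illustrated with a small mean shift sketch. This is not the DRAGS algorithm itself, whose details are not given in the abstract. It assumes each feature is treated as a point in sample space, uses a Gaussian kernel, and merges features whose mean shift trajectories end at nearly the same density peak. The function names, `bandwidth`, and `merge_radius` are hypothetical choices made for illustration only.

```python
import numpy as np

def mean_shift_peaks(points, bandwidth, n_iter=100, tol=1e-6):
    """Move every point uphill on a Gaussian kernel density estimate."""
    points = np.asarray(points, dtype=float)
    modes = points.copy()
    for _ in range(n_iter):
        # Squared distance from each current position to every data point.
        d2 = ((modes[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
        weights = np.exp(-d2 / (2.0 * bandwidth ** 2))
        # Each position moves to the kernel-weighted mean of the data.
        new_modes = (weights @ points) / weights.sum(axis=1, keepdims=True)
        shift = np.abs(new_modes - modes).max()
        modes = new_modes
        if shift < tol:
            break
    return modes

def group_by_peak(modes, merge_radius):
    """Give the same label to points whose end positions nearly coincide."""
    labels = np.full(len(modes), -1, dtype=int)
    centers = []
    for i, mode in enumerate(modes):
        for k, center in enumerate(centers):
            if np.linalg.norm(mode - center) < merge_radius:
                labels[i] = k
                break
        else:
            # No existing peak is close enough, so start a new group.
            centers.append(mode)
            labels[i] = len(centers) - 1
    return labels, np.array(centers)

# Toy data (synthetic, for illustration): 8 features measured on 5 samples.
# Rows are features; the first 4 and the last 4 form two tight groups.
rng = np.random.default_rng(0)
group_a = 0.0 + 0.1 * rng.standard_normal((4, 5))
group_b = 10.0 + 0.1 * rng.standard_normal((4, 5))
X = np.vstack([group_a, group_b])

modes = mean_shift_peaks(X, bandwidth=1.0)       # bandwidth is an assumption
labels, centers = group_by_peak(modes, merge_radius=0.5)
print(labels)
```

Each resulting group could then be treated as a single unit in a later relevance ranking step, which is the role the abstract assigns to dense feature groups. In practice the bandwidth would need to be tuned to the data.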

References (2)
Mark A. Hall, Ian H. Witten, Eibe Frank, Data Mining: Practical Machine Learning Tools and Techniques (1999)
Yizong Cheng, Mean shift, mode seeking, and clustering, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 17, pp. 790-799 (1995), DOI: 10.1109/34.400568