A unifying criterion for unsupervised clustering and feature selection

Authors: Mihaela Breaban, Henri Luchian

DOI: 10.1016/J.PATCOG.2010.10.006

Keywords: Synthetic data, Feature selection, Machine learning, Data mining, Optimization problem, Mathematics, Exploratory data analysis, Artificial intelligence, Heuristics, Global optimization, Feature extraction, Unsupervised learning

Abstract: Exploratory data analysis methods are essential for getting insight into data. Identifying the most important variables and detecting quasi-homogenous groups of data are problems of interest in this context. Solving such problems is a difficult task, mainly due to the unsupervised nature of the underlying learning process. Unsupervised feature selection and unsupervised clustering can be successfully approached as optimization problems by means of global optimization heuristics if an appropriate objective function is considered. This paper introduces an objective function capable of efficiently guiding the search for significant features and, simultaneously, for the respective optimal partitions. Experiments conducted on complex synthetic data suggest that the function we propose is unbiased with respect to both the number of clusters and the number of features.

References (25)
Bhavani Raskutti, Christopher Leckie, An evaluation of criteria for measuring the quality of clusters. International Joint Conference on Artificial Intelligence, pp. 905-910 (1999).
Christian Borgelt, Fuzzy Subspace Clustering. Advances in Data Analysis, Data Handling and Business Intelligence, pp. 93-103 (2009). DOI: 10.1007/978-3-642-01044-6_8
J. Handl, J. Knowles, Improvements to the scalability of multiobjective clustering. Congress on Evolutionary Computation, vol. 3, pp. 2372-2379 (2005). DOI: 10.1109/CEC.2005.1554990
S. Luchian, H. Luchian, M. Petriuc, Evolutionary automated classification. World Congress on Computational Intelligence, pp. 585-588 (1994). DOI: 10.1109/ICEC.1994.349994
Luis Talavera, Feature Selection as a Preprocessing Step for Hierarchical Clustering. International Conference on Machine Learning, pp. 389-397 (1999).
Glenn W. Milligan, Martha C. Cooper, An examination of procedures for determining the number of clusters in a data set. Psychometrika, vol. 50, pp. 159-179 (1985). DOI: 10.1007/BF02294245
Peter J. Rousseeuw, Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics, vol. 20, pp. 53-65 (1987). DOI: 10.1016/0377-0427(87)90125-7
Minho Kim, R.S. Ramakrishna, New indices for cluster validity assessment. Pattern Recognition Letters, vol. 26, pp. 2353-2363 (2005). DOI: 10.1016/J.PATREC.2005.04.007