Dimension reduction using clustering algorithm and rough set theory

Authors: Shampa Sengupta, Asit Kumar Das

DOI: 10.1007/978-3-642-35380-2_82

Abstract: Real-world datasets often have a large number of attributes, but only a few are important for describing them properly. This paper proposes a novel dimension reduction algorithm for real-valued datasets that uses Rough Set Theory and a clustering algorithm to generate a reduct. First, the projection of the dataset on two conditional attributes $C_i$ and $C_j$ is taken, and K-means clustering is applied to it with K equal to the number of distinct values of the decision attribute $D$, yielding K clusters. The dataset is also partitioned into K groups using the indiscernibility relation on $D$. The connecting factor $k$ of the combined conditional attributes $(C_iC_j)$ with respect to $D$ is then calculated from the two cluster sets, and an attribute connecting set $ACS = \{(C_iC_j \xrightarrow{k} D) : C_i, C_j \in C\}$ is formed, where $C$ is the conditional attribute set and $D$ is the decision attribute. Each element $(C_iC_j \xrightarrow{k} D)$ means that $C_i$ and $C_j$ together partition the objects in a way that yields $(k \times 100)\%$ similar partitions to those made by $D$. Next, an undirected weighted graph, with the connecting factors $k$ as edge weights, is constructed from the ACS. Finally, based on the weights associated with the edges, the important attributes, called the reduct, are generated. Experimental results show the efficiency of the proposed method.
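The pipeline described in the abstract, up to the attribute connecting set, can be illustrated with a minimal Python sketch. Several parts are assumptions made only for illustration, since the abstract does not specify them: the connecting factor is taken here to be the fraction of objects on which the K-means partition agrees with the decision partition under the best one-to-one matching of clusters to decision values; the K-means routine is a plain Lloyd iteration with a deterministic farthest-point seeding; and the function names (`kmeans`, `connecting_factor`, `attribute_connecting_set`) and the toy data are hypothetical.

```python
from itertools import combinations, permutations


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=100):
    """Plain Lloyd's K-means with deterministic farthest-point seeding (an assumption)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from all seeds chosen so far.
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    labels = None
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def connecting_factor(cluster_labels, decisions):
    """Assumed definition of k: best-matching agreement between the two partitions."""
    clusters = sorted(set(cluster_labels))
    classes = sorted(set(decisions))
    best = 0
    # Try every one-to-one assignment of clusters to decision values (fine for small K).
    for perm in permutations(classes, len(clusters)):
        mapping = dict(zip(clusters, perm))
        agree = sum(mapping[c] == d for c, d in zip(cluster_labels, decisions))
        best = max(best, agree)
    return best / len(decisions)


def attribute_connecting_set(rows, decisions):
    """Map each attribute pair (i, j) to its connecting factor with respect to D."""
    k = len(set(decisions))  # K = number of distinct decision values
    acs = {}
    for i, j in combinations(range(len(rows[0])), 2):
        projection = [(r[i], r[j]) for r in rows]  # dataset projected on Ci, Cj
        acs[(i, j)] = connecting_factor(kmeans(projection, k), decisions)
    return acs


# Hypothetical toy data: attribute 0 follows the decision, attributes 1 and 2 do not.
rows = [(1, 0, 0), (2, 0, 1), (1, 10, 10), (50, 0, 1), (51, 10, 10), (50, 10, 11)]
decisions = ["a", "a", "a", "b", "b", "b"]
acs = attribute_connecting_set(rows, decisions)
for (i, j), k in sorted(acs.items()):
    print(f"C{i}C{j} -> D with k = {k:.2f}")
```

On this toy data, the pairs containing attribute 0 reach a connecting factor of 1.00, while the pair (1, 2) reaches only 0.67. The resulting dictionary can be read directly as the weighted edge list of the undirected graph over attributes; how the reduct is then selected from the edge weights is not detailed in the abstract, so that step is left out of the sketch.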
