Farthest-Point Heuristic based Initialization Methods for K-Modes Clustering

Author: Zengyou He

Abstract: The k-modes algorithm has become a popular technique for solving categorical data clustering problems in different application domains. However, the algorithm requires random selection of initial points for the clusters, and different initial points often lead to considerably different clustering results. In this paper we present an experimental study on applying a farthest-point heuristic based initialization method to k-modes clustering to improve its performance. Experiments show that the new initialization method leads to better clustering accuracy than random initialization.
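
The abstract does not spell out the initialization procedure, so the following is only a minimal sketch of a generic farthest-point traversal adapted to categorical records, not the paper's exact method. It assumes the simple matching (Hamming) dissimilarity commonly used with k-modes, and it assumes the first seed is given by the caller. The names `hamming`, `farthest_point_init`, and `records` are hypothetical and chosen for illustration.

```python
from typing import List, Sequence


def hamming(a: Sequence[str], b: Sequence[str]) -> int:
    """Simple matching dissimilarity: number of attributes that differ."""
    return sum(x != y for x, y in zip(a, b))


def farthest_point_init(data: List[Sequence[str]], k: int, first: int = 0) -> List[int]:
    """Return indices of k initial points picked by farthest-point traversal.

    Assumption: the first seed is data[first]; the paper may choose it differently.
    """
    if not 1 <= k <= len(data):
        raise ValueError("k must be between 1 and len(data)")
    chosen = [first]
    # min_dist[i] is the distance from record i to its nearest chosen seed.
    min_dist = [hamming(row, data[first]) for row in data]
    while len(chosen) < k:
        # Next seed: the record farthest from every seed chosen so far
        # (ties resolve to the lowest index).
        nxt = max(range(len(data)), key=lambda i: min_dist[i])
        chosen.append(nxt)
        for i, row in enumerate(data):
            min_dist[i] = min(min_dist[i], hamming(row, data[nxt]))
    return chosen


# Toy categorical data set (made up for illustration only).
records = [
    ("red", "small", "round"),
    ("red", "small", "oval"),
    ("blue", "large", "square"),
    ("blue", "large", "round"),
    ("green", "medium", "oval"),
]
seeds = farthest_point_init(records, k=3)
print(seeds)  # indices of the three initial modes
```

The selected records would then serve as the initial modes in place of randomly drawn ones. Because the traversal is deterministic once the first seed is fixed, repeated runs start from the same points, which is the property that addresses the run-to-run variation described in the abstract.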
