A Cluster Description Method for High Dimensional Data Clustering with Categorical Variables

Authors: Sen Wu, Shujuan Gu

DOI: 10.1109/ICMTMA.2010.147

Keywords:

Abstract: High dimensional data clustering has always been a difficult problem in clustering research. Before the clustering process is accomplished, the partition of objects is unknown. Therefore, after the clustering process, the results of the final clusters should be presented understandably, which becomes especially difficult when it comes to high dimensionality. This paper presents a cluster description schema for high dimensional data clustering with categorical variables. The schema uses the supremum and infimum to represent clusters concisely. Based on this description, a new method is given to assign non-sample data to the clusters obtained from the sample space. The distribution method requires a one-time scan of the dataset, updates the cluster description dynamically, and can detect isolated objects. Experiments on both synthetic and real data show its effectiveness and scalability.

References (11)
Martin Ester, Byron J. Gao, Cluster Description Formats, Problems and Algorithms. SIAM International Conference on Data Mining, pp. 464-468 (2006)
Xuedong Gao, Sen Wu, CABOSFV algorithm for high dimensional sparse data clustering. Journal of University of Science and Technology Beijing (English Edition), vol. 11, pp. 283-288 (2004)
Laks V.S. Lakshmanan, Raymond T. Ng, Christine Xing Wang, Xiaodong Zhou, Theodore J. Johnson, The generalized MDL approach for summarization. Very Large Data Bases, pp. 766-777 (2002), 10.1016/B978-155860869-6/50073-1
Rakesh Agrawal, Johannes Gehrke, Dimitrios Gunopulos, Prabhakar Raghavan, Automatic subspace clustering of high dimensional data for data mining applications. Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data (SIGMOD '98), vol. 27, pp. 94-105 (1998), 10.1145/276304.276314
Venkatesh Ganti, Johannes Gehrke, Raghu Ramakrishnan, CACTUS—clustering categorical data using summaries. Knowledge Discovery and Data Mining, pp. 73-83 (1999), 10.1145/312129.312201
M.J. Zaki, M. Peters, CLICKS: Mining Subspace Clusters in Categorical Data via K-Partite Maximal Cliques. International Conference on Data Engineering, pp. 355-356 (2005), 10.1109/ICDE.2005.33
Micheline Kamber, Jiawei Han, Jian Pei, Data Mining: Concepts and Techniques (2000)
Mihael Ankerst, Markus M. Breunig, Hans-Peter Kriegel, Jörg Sander, OPTICS: ordering points to identify the clustering structure. International Conference on Management of Data, vol. 28, pp. 49-60 (1999), 10.1145/304181.304187
YU Be, Visual Clustering for High Dimensional Data Based on Nearest Neighbor. Journal of Computer Research and Development (2000)