Automatic extraction of clusters from hierarchical clustering representations

Authors: Jörg Sander, Nan Niu, Zhiyong Lu, Alex Kovarsky, Xuejie Qin

DOI: 10.5555/1760894.1760906

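Abstract: Hierarchical clustering algorithms are typically more effective in detecting the true clustering structure of a data set than partitioning algorithms. However, hierarchical clustering algorithms do not actually create clusters, but compute only a hierarchical representation of the data set. This makes them unsuitable as an automatic pre-processing step for other algorithms that operate on detected clusters. This is true for both dendrograms and reachability plots, which have been proposed as hierarchical clustering representations and which have different advantages and disadvantages. In this paper we first investigate the relation between dendrograms and reachability plots and introduce methods to convert them into each other, showing that they essentially contain the same information. Based on reachability plots, we then introduce a technique that automatically determines the significant clusters in a hierarchical cluster representation. This makes it possible for the first time to use hierarchical clustering as an automatic pre-processing step that requires no user interaction to select clusters from a hierarchical cluster representation.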
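The relation between reachability plots and dendrograms mentioned in the abstract can be illustrated with a small sketch. It is not the authors' conversion method or their automatic cluster extraction technique; it only assumes the commonly described intuition that the highest bar inside a segment of a reachability plot separates two sub-clusters. The names `Node`, `plot_to_tree`, and `cut`, as well as the sample values, are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class Node:
    start: int                      # first position (inclusive) in the cluster ordering
    end: int                        # last position (exclusive) in the cluster ordering
    height: float                   # reachability value at which this node splits (0.0 for leaves)
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def plot_to_tree(reach: Sequence[float], start: int = 0,
                 end: Optional[int] = None) -> Node:
    """Build a dendrogram-like binary tree from a reachability plot.

    Assumption: reach[i] is the reachability value of the i-th point in the
    cluster ordering, so the largest value inside a segment marks the
    boundary between two sub-clusters.
    """
    if end is None:
        end = len(reach)
    if end - start <= 1:
        return Node(start, end, 0.0)
    # reach[start] describes how the segment attaches to what precedes it,
    # so only positions start+1 .. end-1 are candidate split points.
    split = max(range(start + 1, end), key=lambda i: reach[i])
    return Node(start, end, reach[split],
                plot_to_tree(reach, start, split),
                plot_to_tree(reach, split, end))


def cut(node: Node, threshold: float) -> List[Tuple[int, int]]:
    """Return the (start, end) segments left after cutting the tree at a threshold."""
    if node.left is None or node.right is None or node.height <= threshold:
        return [(node.start, node.end)]
    return cut(node.left, threshold) + cut(node.right, threshold)


# Hypothetical reachability values: two dense groups separated by one high bar.
reach = [float("inf"), 0.2, 0.3, 0.2, 1.5, 0.25, 0.2, 0.3]
root = plot_to_tree(reach)
print(cut(root, 1.0))  # [(0, 4), (4, 8)]
```

Note that `cut` relies on a user-supplied threshold, which is exactly the kind of manual interaction the paper sets out to avoid; the sketch only shows why the two representations carry comparable information.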
