Hierarchical Subspace Clustering

Author: Elke Achtert

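Abstract: It is well known that traditional clustering methods, which consider all dimensions of the feature space, usually fail in terms of efficiency and effectiveness when applied to high-dimensional data. This poor behavior is based on the fact that clusters may not be found in the high-dimensional feature space, although clusters exist in subspaces of the feature space. To overcome these limitations of traditional clustering methods, several methods for subspace clustering have been proposed recently. Subspace clustering algorithms aim at automatically identifying lower-dimensional subspaces of the feature space in which clusters exist. There are two types of subspace clustering algorithms: algorithms for detecting clusters in axis-parallel subspaces and, as an extension, algorithms for finding clusters in subspaces which are arbitrarily oriented. Generally, subspace clusters may be hierarchically nested, i.e., several subspace clusters of low dimensionality may form a subspace cluster of higher dimensionality. Since existing subspace clustering methods are not able to detect these complex structures, hierarchical approaches for subspace clustering have to be applied. The goal of this dissertation is to develop new efficient and effective methods for hierarchical subspace clustering by identifying novel challenges for the hierarchical approach and proposing innovative and solid solutions for these challenges. The first part of this work deals with the analysis of hierarchical subspace clusters in axis-parallel subspaces. Two new methods are proposed that search simultaneously for subspace clusters of arbitrary dimensionality in order to detect complex hierarchies of subspace clusters. Furthermore, a new visualization model of the clustering result by means of a graph representation is provided. In the second part of this work, new methods for hierarchical subspace clustering in arbitrarily oriented subspaces of the feature space are discussed. The so-called correlation clustering can be seen as an extension of axis-parallel subspace clustering. Correlation clustering aims at grouping the data set into subsets, the so-called correlation clusters, such that the objects in the same correlation cluster show uniform attribute correlations. Two new hierarchical approaches are proposed which combine density-based clustering with Principal Component Analysis in order to identify hierarchies of correlation clusters. The last part of this work addresses the analysis and interpretation of the results obtained from correlation clustering algorithms. A general method is introduced to extract quantitative information on the linear dependencies between the objects of given correlation clusters. Furthermore, these quantitative models can be used to predict the probability that an object is created by one of these models. Both the efficiency and the effectiveness of the presented techniques are thoroughly analyzed. The benefits over traditional approaches are shown by evaluating the new methods on synthetic as well as real-world test data sets.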
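
The idea of describing a correlation cluster through its linear dependencies can be illustrated with a minimal sketch. This is not the dissertation's algorithm: the function name `correlation_model`, the threshold `alpha = 0.85`, and the synthetic data are assumptions made only for illustration. The sketch applies Principal Component Analysis to the points of one given cluster and splits the eigenvectors into "strong" ones (spanning the subspace in which the cluster extends) and "weak" ones (whose directions carry almost no variance, so each yields an approximate linear equation satisfied by the cluster's points).

```python
import numpy as np

def correlation_model(points, alpha=0.85):
    """Split the PCA eigenvectors of one cluster into strong and weak ones.

    alpha is a hypothetical threshold: the smallest number of leading
    eigenvectors whose cumulative explained variance reaches alpha is
    treated as "strong"; the remaining ones are "weak".
    """
    mean = points.mean(axis=0)
    cov = np.cov(points - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]           # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = np.cumsum(eigvals) / eigvals.sum()
    k = int(np.searchsorted(explained, alpha)) + 1
    return mean, eigvecs[:, :k], eigvecs[:, k:]

# Synthetic cluster (assumed data): points near a line in 3-D space.
rng = np.random.default_rng(0)
t = rng.uniform(-5.0, 5.0, size=200)
direction = np.array([1.0, 2.0, -1.0])
offset = np.array([3.0, 0.0, 1.0])
points = t[:, None] * direction + offset + rng.normal(scale=0.01, size=(200, 3))

mean, strong, weak = correlation_model(points)

# Each weak eigenvector w gives one approximate equation w . (x - mean) = 0.
residuals = (points - mean) @ weak
print("correlation dimensionality:", strong.shape[1])
print("largest equation residual:", np.abs(residuals).max())
```

The size of a point's residual against these equations could then serve as a rough indicator of how well that point fits the cluster's model, in the spirit of the prediction step the abstract mentions.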
