Advancing data clustering via projective clustering ensembles

Authors: Francesco Gullo, Carlotta Domeniconi, Andrea Tagarelli

DOI: 10.1145/1989323.1989400

Keywords: FLAME clustering, Computer science, Clustering high-dimensional data, Fuzzy clustering, Canopy clustering algorithm, Data mining, Dimensionality reduction, Correlation clustering, Consensus clustering, Biclustering, CURE data clustering algorithm, Data stream clustering, Brown clustering, Cluster analysis, Theoretical computer science, Ensemble learning

Abstract: Projective Clustering Ensembles (PCE) are a very recent advance in data clustering research which combines the two powerful tools of clustering ensembles and projective clustering. Specifically, PCE enables clustering ensemble methods to handle ensembles composed by projective clustering solutions. PCE has been formalized as an optimization problem with either a two-objective or a single-objective function. Two-objective PCE has been shown to generally produce more accurate clustering results than its single-objective counterpart, although it can handle the object-based and feature-based cluster representations only independently of one another. Moreover, both early formulations of PCE do not follow any of the standard approaches to clustering ensembles, namely instance-based, cluster-based, and hybrid. In this paper, we propose an alternative formulation of the PCE problem that overcomes the above issues. We investigate the drawbacks of the early formulations of PCE and define a new formulation of the problem. This formulation is capable of treating the object-based and feature-based cluster representations as a whole, essentially tying them together in a distance computation between a projective clustering solution and a given ensemble. We propose cluster-based algorithms for computing approximations to the proposed PCE formulation, which have the common merit of conforming to one of the standard approaches to clustering ensembles. Experiments on benchmark datasets show the significance of our PCE formulation, as our heuristics outperform existing PCE methods.

References (36)
Bruno Leclerc, Jean-Pierre Barthélemy, The Median Procedure for Partitions. Partitioning Data Sets. pp. 3-34 (1993)
Andrea Tagarelli, Francesco Gullo, Sergio Greco, Diversity-Based Weighting Schemes for Clustering Ensembles. SIAM International Conference on Data Mining. pp. 437-448 (2009)
Elke Achtert, Christian Böhm, Hans-Peter Kriegel, Peer Kröger, Ina Müller-Gorman, Arthur Zimek, Detection and Visualization of Subspace Cluster Hierarchies. Advances in Databases: Concepts, Systems and Applications. pp. 152-163 (2007), 10.1007/978-3-540-71703-4_15
K. Sequeira, M. Zaki, SCHISM: a new approach for interesting subspace mining. International Conference on Data Mining. pp. 186-193 (2004), 10.1109/ICDM.2004.10099
Hanan Ayad, Mohamed Kamel, Finding Natural Clusters Using Multi-clusterer Combiner Based on Shared Nearest Neighbors. Multiple Classifier Systems. pp. 166-175 (2003), 10.1007/3-540-44938-8_17
Joydeep Ghosh, Raymond Mooney, Alexander Strehl, Impact of Similarity Measures on Web-page Clustering (2000)
Constantinos Boulis, Mari Ostendorf, Combining multiple clustering systems. European Conference on Principles of Data Mining and Knowledge Discovery. vol. 3202, pp. 63-74 (2004), 10.1007/978-3-540-30116-5_9
Evgenia Dimitriadou, Andreas Weingessel, Kurt Hornik, Voting-Merging: An Ensemble Method for Clustering. Artificial Neural Networks - ICANN 2001. pp. 217-224 (2001), 10.1007/3-540-44668-0_31
Ana Fred, Finding Consistent Clusters in Data Partitions. Multiple Classifier Systems. pp. 309-318 (2001), 10.1007/3-540-48219-9_31
Guojun Gan, Chaoqun Ma, Jianhong Wu, Data Clustering: Theory, Algorithms, and Applications (2007)