Clustering with Propagated Constraints

Author: Eric Robert Eaton

Keywords: Constrained clustering; Pairwise comparison; Cluster analysis; Quality (business); Focus (optics); Theoretical computer science; Gaussian function; Cluster (physics); Computer science; Perspective (graphical)

Abstract: Title of Thesis: Clustering with Propagated Constraints. Eric Robert Eaton, Master of Science, 2005. Thesis directed by: Dr. Marie desJardins, Assistant Professor, Department of Computer Science and Electrical Engineering. Background knowledge in the form of constraints can dramatically improve the quality of generated clustering models. In constrained clustering, these constraints typically specify the relative cluster membership of pairs of points. They are tedious to specify and expensive from a user perspective, yet very useful in large quantities. Existing constrained clustering methods perform well when given large quantities of constraints, but do not focus on performing well when given small quantities. This thesis focuses on providing a high-quality clustering with small quantities of constraints. It proposes a method for propagating pairwise constraints to nearby instances using a Gaussian function. The method takes a few easily specified constraints and propagates them to nearby pairs of points to constrain the local neighborhood. Clustering with these propagated constraints can yield superior performance with fewer constraints than clustering with only the original user-specified constraints. The experiments compare the performance of clustering with propagated constraints to that of established constrained clustering algorithms on several real-world data sets.
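
The propagation idea described in the abstract can be illustrated with a small sketch. This is not the thesis's actual formulation, which the abstract does not give. It assumes an isotropic Gaussian of width `sigma` centered on the two endpoints of each user constraint, and it drops propagated weights below a cutoff `threshold`. The function names, both parameters, and the rule of keeping the strongest weight per pair are hypothetical choices made for illustration.

```python
import math


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def propagate_constraints(points, constraints, sigma=1.0, threshold=0.1):
    """Spread each user constraint to nearby pairs of points.

    points:      list of coordinate tuples
    constraints: list of (i, j, label); label is +1 (must-link) or -1 (cannot-link)
    sigma:       assumed Gaussian width (hypothetical parameter)
    threshold:   assumed cutoff below which a propagated weight is dropped

    Returns a dict mapping (u, v) with u < v to a signed weight in [-1, 1].
    """
    propagated = {}
    n = len(points)
    for i, j, label in constraints:
        for u in range(n):
            for v in range(n):
                if u == v:
                    continue
                # The weight decays as u moves away from i and v moves away from j.
                d2 = sq_dist(points[u], points[i]) + sq_dist(points[v], points[j])
                w = math.exp(-d2 / (2.0 * sigma ** 2))
                if w < threshold:
                    continue
                key = (min(u, v), max(u, v))
                # Assumed conflict rule: keep the strongest weight seen for a pair.
                if w > abs(propagated.get(key, 0.0)):
                    propagated[key] = label * w
    return propagated


# Two tight groups of points and one cannot-link constraint between them.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
weights = propagate_constraints(points, [(0, 2, -1)], sigma=0.5)
for pair in sorted(weights):
    print(pair, round(weights[pair], 3))
```

In this toy data the single cannot-link between points 0 and 2 also yields strong cannot-link weights for the other cross-group pairs, while pairs inside a group receive no propagated weight because they are far from the constrained pair in the joint distance used here.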

References (25)
James Franklin. The elements of statistical learning: data mining, inference, and prediction. The Mathematical Intelligencer, vol. 27, pp. 83-85 (2005). doi:10.1007/BF02985802
Rich Caruana, Andrew Kachites McCallum, David Cohn. Semi-Supervised Clustering with User Feedback. Cornell University (2003).
Sepandar D. Kamvar, Christopher D. Manning, Dan Klein. From Instance-level Constraints to Space-Level Constraints: Making the Most of Prior Knowledge in Data Clustering. International Conference on Machine Learning, pp. 307-314 (2002).
Noam Shental, Tomer Hertz, Daphna Weinshall, Misha Pavel. Adjustment Learning and Relevant Component Analysis. European Conference on Computer Vision, pp. 776-792 (2002). doi:10.1007/3-540-47979-1_52
Zoubin Ghahramani, John D. Lafferty, Xiaojin Zhu. Semi-supervised learning: from Gaussian fields to Gaussian processes. Carnegie Mellon University, USA (2003). doi:10.1184/R1/6609434.V1
Raymond J. Mooney, Sugato Basu. Semi-supervised clustering: probabilistic models, algorithms and experiments. University of Texas at Austin (2005).
William M. Rand. Objective Criteria for the Evaluation of Clustering Methods. Journal of the American Statistical Association, vol. 66, pp. 846-850 (1971). doi:10.1080/01621459.1971.10482356
C. L. Blake. UCI Repository of machine learning databases. www.ics.uci.edu/~mlearn/MLRepository.html (1998).
Mikhail Bilenko, Sugato Basu, Raymond J. Mooney. Integrating constraints and metric learning in semi-supervised clustering. International Conference on Machine Learning, pp. 11- (2004). doi:10.1145/1015330.1015360
Arindam Banerjee, Raymond J. Mooney, Sugato Basu. Semi-supervised Clustering by Seeding. International Conference on Machine Learning, pp. 27-34 (2002).