Comparing and unifying search-based and similarity-based approaches to semi-supervised clustering

Authors: Sugato Basu, Mikhail Bilenko, Raymond J. Mooney

DOI:

Keywords:

Abstract: Semi-supervised clustering employs a small amount of labeled data to aid unsupervised learning. Previous work in the area has employed one of two approaches: 1) search-based methods that utilize supervised data to guide the search for the best clustering, and 2) similarity-based methods that use supervised data to adapt the underlying similarity metric used by the clustering algorithm. This paper presents a unified approach based on the K-Means clustering algorithm that incorporates both of these techniques. Experimental results demonstrate that the combined approach generally produces better clusters than either of the individual approaches.
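
To make the two ideas in the abstract concrete, the following is a minimal illustrative sketch in Python, not the algorithm from the paper. It assumes a simple stand-in for each idea: the similarity-based part weights each feature by the inverse of its within-class variance on the labeled points, and the search-based part seeds the K-Means centroids from the labeled points and keeps those points in their given clusters. The function names (`learn_weights`, `semi_supervised_kmeans`), the `labels` dictionary format, and the weighting heuristic are all assumptions made for illustration.

```python
def mean(rows):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(col) / len(rows) for col in zip(*rows))


def learn_weights(points, labels):
    """Similarity-based step (assumed heuristic, not the paper's method):
    weight each feature by the inverse of its pooled within-class variance
    on the labeled points, so features that agree within a class count more."""
    dim = len(points[0])
    by_class = {}
    for idx, cls in labels.items():
        by_class.setdefault(cls, []).append(points[idx])
    spread = [0.0] * dim
    for members in by_class.values():
        center = mean(members)
        for p in members:
            for d in range(dim):
                spread[d] += (p[d] - center[d]) ** 2
    n = len(labels)
    # Small constant avoids division by zero when a feature never varies.
    return [1.0 / (spread[d] / n + 1e-6) for d in range(dim)]


def weighted_sq_dist(p, q, w):
    """Squared Euclidean distance with one weight per feature."""
    return sum(wd * (a - b) ** 2 for wd, a, b in zip(w, p, q))


def semi_supervised_kmeans(points, labels, k, iters=20):
    """K-Means guided by labeled points.

    labels maps point index -> cluster id in range(k); this sketch assumes
    every cluster has at least one labeled point.
    """
    w = learn_weights(points, labels)
    # Search-based step: seed each centroid from its labeled points.
    centroids = [
        mean([points[i] for i, c in labels.items() if c == cluster])
        for cluster in range(k)
    ]
    assign = None
    for _ in range(iters):
        new_assign = []
        for i, p in enumerate(points):
            if i in labels:
                # Labeled points always stay in their given cluster.
                new_assign.append(labels[i])
            else:
                new_assign.append(min(
                    range(k),
                    key=lambda c: weighted_sq_dist(p, centroids[c], w),
                ))
        if new_assign == assign:
            break  # assignments stopped changing
        assign = new_assign
        for cluster in range(k):
            members = [points[i] for i, c in enumerate(assign) if c == cluster]
            if members:
                centroids[cluster] = mean(members)
    return assign, w


# Two groups separated along the first feature, noisy along the second.
points = [(0.0, 0.0), (0.2, 5.0), (0.1, -4.0), (0.3, 3.0),
          (3.0, 0.5), (3.2, -5.0), (2.9, 4.0), (3.1, -3.0)]
labels = {0: 0, 1: 0, 4: 1, 5: 1}  # four labeled points, two per cluster
assignments, weights = semi_supervised_kmeans(points, labels, k=2)
```

In this toy data the labeled points show that the second feature varies widely within each class, so it receives a small weight and the clustering is driven mainly by the first feature. A plain K-Means run with an unweighted Euclidean distance would have no such information available.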
