Abstract: We consider the following problem: given a set of clusterings, find a single clustering that agrees as much as possible with the input clusterings. This problem, clustering aggregation, appears naturally in various contexts. For example, clustering categorical data is an instance of the clustering aggregation problem; each categorical attribute can be viewed as a clustering of the input rows, where rows are grouped together if they take the same value on that attribute. Clustering aggregation can also be used as a metaclustering method to improve the robustness of clustering by combining the output of multiple algorithms. Furthermore, the problem formulation does not require a priori information about the number of clusters; it is determined by the optimization function. In this article, we give a formal statement of the clustering aggregation problem and propose a number of algorithms. Our algorithms make use of the connection between clustering aggregation and the problem of correlation clustering. Although the problems we consider are NP-hard, for several of our methods we provide theoretical guarantees on the quality of the solutions. Our work provides the best deterministic approximation algorithm for the variation of the correlation clustering problem that we consider. We also show how sampling can be used to scale the algorithms to large datasets. We give an extensive empirical evaluation demonstrating the usefulness of the problem and of the solutions.
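To make the problem statement concrete, the following minimal sketch treats each categorical attribute of a small table as one input clustering and scores a candidate clustering against all of them. The abstract does not say how agreement is measured; counting the pairs of rows on which two clusterings disagree is an assumption made here for illustration, and the function names and toy data are hypothetical.

```python
from itertools import combinations


def attribute_to_clustering(rows, attr):
    """Return one cluster label per row: the value the row takes on `attr`."""
    return [row[attr] for row in rows]


def disagreements(a, b):
    """Count pairs of objects placed together by one labeling and apart by the other.

    Assumption: pairwise disagreement is used as the (dis)agreement measure.
    """
    return sum(
        (a[i] == a[j]) != (b[i] == b[j])
        for i, j in combinations(range(len(a)), 2)
    )


def total_disagreement(candidate, clusterings):
    """Sum the pairwise disagreements between a candidate and every input clustering."""
    return sum(disagreements(candidate, c) for c in clusterings)


# Hypothetical toy table with two categorical attributes.
rows = [
    {"color": "red", "shape": "round"},
    {"color": "red", "shape": "round"},
    {"color": "blue", "shape": "round"},
    {"color": "blue", "shape": "square"},
]

# Each attribute induces one input clustering of the four rows.
inputs = [attribute_to_clustering(rows, a) for a in ("color", "shape")]

# A candidate aggregate clustering: rows 0 and 1 together, rows 2 and 3 together.
candidate = [0, 0, 1, 1]
score = total_disagreement(candidate, inputs)
```

Under this assumed measure, an aggregation method would search for the candidate with the smallest score. The number of clusters is never passed in; it follows from whichever candidate minimizes the objective.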