作者: Byron J. Gao , Martin Ester , Jin-Yi Cai , Oliver Schulte , Hui Xiong
关键词:
摘要: In this paper, we introduce and study the Minimum Consistent Subset Cover (MCSC) problem. Given a finite ground set X constraint t, find minimum number of consistent subsets that cover X, where subset is if it satisfies t. The MCSC problem generalizes traditional covering has Clique Partition, dual graph coloring, as an instance. Many practical data mining problems in areas rule learning, clustering, frequent pattern can be formulated instances. particular, discuss Rule Set minimizes model complexity decision rules well some converse k-clustering minimize clusters satisfying certain distance constraints. We also show how applications summarization. For any these formulations, our proposed novel graph-based generic algorithm CAG directly applicable. starts by constructing maximal optimal partial solution, then performs example-driven specific-to-general search on dynamically maintained bipartite assignment to simultaneously learn with small cardinality set. Our experiments benchmark datasets achieves good results compared existing popular heuristics.