Authors: Martin Ester, Byron J. Gao
DOI:
Keywords: Algorithm, Minimum description length, Interpretability, Cluster analysis, Computer science, Cluster (physics), Data mining, Heuristic (computer science)
Abstract: Clustering is one of the major data mining tasks. So far, the database and data mining literature lacks a systematic study of cluster descriptions, which are essential to provide the user with understandable knowledge of the clusters and to support further interactive exploration. In this paper, we introduce novel description formats leading to more descriptive power. We define two alternative problems of generating cluster descriptions, Minimum Description Length and Maximum Description Accuracy, providing different trade-offs between interpretability and accuracy. We also present heuristic algorithms for both problems, together with their empirical evaluation and comparison with state-of-the-art algorithms.