Authors: Tarek Abudawood, Peter Flach
DOI: 10.1007/978-3-642-04180-8_20
Keywords:
Abstract: Subgroup discovery aims at finding subsets of a population whose class distribution is significantly different from the overall distribution. It has previously been investigated predominantly in a two-class context. This paper investigates multi-class subgroup discovery methods. We consider six evaluation measures for multi-class subgroups, four of them new, and study their theoretical properties. We extend the CN2-SD algorithm to incorporate a new weighting scheme inspired by AdaBoost. We demonstrate the usefulness of multi-class subgroup discovery experimentally, using the discovered subgroups as features for a decision tree learner. Not only is the number of leaves reduced by a factor of between 8 and 16 on average, but significant improvements in accuracy and AUC are achieved in particular settings. Similar performance improvements can be observed when using naive Bayes.