Evaluation Measures for Multi-class Subgroup Discovery

Authors: Tarek Abudawood, Peter Flach

DOI: 10.1007/978-3-642-04180-8_20

Abstract: Subgroup discovery aims at finding subsets of a population whose class distribution is significantly different from the overall distribution. It has previously been investigated predominantly in a two-class context. This paper investigates multi-class subgroup discovery methods. We consider six evaluation measures for subgroups, four of them new, and study their theoretical properties. We extend the CN2-SD algorithm to incorporate a new weighting scheme inspired by AdaBoost. We demonstrate the usefulness of the approach experimentally, using the discovered subgroups as features for a decision tree learner. Not only is the number of leaves reduced by a factor of between 8 and 16 on average, but significant improvements in accuracy and AUC are also achieved in particular settings. Similar performance can be observed when using naive Bayes.

References (26)
Henrik Boström, Covering vs. divide-and-conquer for top-down induction of logic programs. International Joint Conference on Artificial Intelligence, pp. 1194-1200 (1995)
Jerome H. Friedman, Nicholas I. Fisher, Bump hunting in high-dimensional data. Statistics and Computing, vol. 9, pp. 123-143 (1999), 10.1023/A:1008894516817
Janez Demšar, Statistical Comparisons of Classifiers over Multiple Data Sets. Journal of Machine Learning Research, vol. 7, pp. 1-30 (2006)
Mark A. Hall, Ian H. Witten, Eibe Frank, Data Mining: Practical Machine Learning Tools and Techniques (1999)
Boonserm Kijsirikul, Nitiwut Ussivakul, Surapant Meknavin, Adaptive Directed Acyclic Graphs for Multiclass Classification. Pacific Rim International Conference on Artificial Intelligence, pp. 158-168 (2002), 10.1007/3-540-45683-X_19
Xin Jin, Anbang Xu, Rongfang Bie, Ping Guo, Machine learning techniques and chi-square feature selection for cancer classification using SAGE gene expression profiles. International Conference on Data Mining, pp. 106-115 (2006), 10.1007/11691730_11
Peter Clark, Robin Boswell, Rule induction with CN2: Some recent improvements. Lecture Notes in Computer Science, pp. 151-163 (1991), 10.1007/BFB0017011
Willi Klösgen, Jan M. Zytkow, Handbook of Data Mining and Knowledge Discovery (2002)
Johannes Fürnkranz, Peter A. Flach, ROC 'n' rule learning: towards a better understanding of covering algorithms. Machine Learning, vol. 58, pp. 39-77 (2005), 10.1007/S10994-005-5011-X