An Analysis of Rule Learning Heuristics

Authors: Johannes Fürnkranz, Peter A. Flach

DOI:

Keywords: Area under the ROC curve, Heuristics, Mathematical optimization, Computer science, Entropy (information theory), Information gain

Abstract: In this paper we analyze the most popular search heuristics for separate-and-conquer rule learning algorithms. Our results show that all commonly used heuristics, including accuracy, weighted relative accuracy, entropy, Gini index and information gain, are equivalent to one of two fundamental prototypes: precision, which tries to optimize the area under the ROC curve for unknown costs, and a cost-weighted difference between covered positive and negative examples, which tries to find the optimal point under known or assumed costs. We also show that a straightforward generalization of the m-heuristic is a means of trading off between these two prototypes.
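The two prototypes and the trade-off named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names, the symbols `p`, `n` (positive and negative examples covered by a rule) and `P`, `N` (totals in the training set), the cost parameter `c`, and the `prior` argument are all assumptions chosen for illustration, and the m-estimate is written in its standard form with an explicit prior.

```python
def precision(p, n):
    """Fraction of covered examples that are positive."""
    return p / (p + n)


def cost_weighted_difference(p, n, c):
    """Difference of covered positives and negatives, weighted by an
    assumed cost ratio c in [0, 1] (hypothetical parameterisation)."""
    return (1 - c) * p - c * n


def weighted_relative_accuracy(p, n, P, N):
    """Coverage times the gain in precision over the class prior.
    Algebraically proportional to p/P - n/N, i.e. a cost-weighted
    difference with costs fixed by the class distribution."""
    coverage = (p + n) / (P + N)
    return coverage * (p / (p + n) - P / (P + N))


def m_estimate(p, n, m, prior):
    """Precision smoothed toward `prior` with strength m.
    m = 0 gives plain precision; for very large m the ranking of rules
    approaches that of (1 - prior) * p - prior * n."""
    return (p + m * prior) / (p + n + m)


if __name__ == "__main__":
    P, N = 100, 100          # toy training set, assumed for illustration
    rule_a = (10, 0)         # small but pure rule
    rule_b = (50, 10)        # larger, slightly impure rule
    for m in (0, 10, 1_000_000):
        a = m_estimate(*rule_a, m, P / (P + N))
        b = m_estimate(*rule_b, m, P / (P + N))
        print(f"m={m}: prefers {'A' if a > b else 'B'}")
```

On this toy data, precision prefers the small pure rule A, while weighted relative accuracy prefers the larger rule B; increasing `m` moves the m-estimate's preference from the first behaviour to the second.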

References (23)
Sholom M. Weiss, Nitin Indurkhya. Reduced complexity rule induction. International Joint Conference on Artificial Intelligence, pp. 678-684 (1991).
Johannes Fürnkranz. FOSSIL: a robust relational learner. European Conference on Machine Learning, pp. 122-137 (1994). DOI: 10.1007/3-540-57868-4_54
Bojan Cestnik. Estimating probabilities: a crucial task in machine learning. European Conference on Artificial Intelligence, pp. 147-149 (1990).
José Hernández-Orallo, Peter A. Flach, César Ferri. Learning Decision Trees Using the Area Under the ROC Curve. International Conference on Machine Learning, pp. 139-146 (2002).
Luc De Raedt, Wim Van Laer. Inductive Constraint Logic. Algorithmic Learning Theory, vol. 997, pp. 80-94 (1995). DOI: 10.1007/3-540-60454-5_30
Richard A. Olshen, Charles J. Stone, Leo Breiman, Jerome H. Friedman. Classification and Regression Trees (1983).
William W. Cohen. Fast Effective Rule Induction. Machine Learning Proceedings 1995, pp. 115-123 (1995). DOI: 10.1016/B978-1-55860-377-6.50023-2
Johannes Fürnkranz. Separate-and-Conquer Rule Learning. Artificial Intelligence Review, vol. 13, pp. 3-54 (1999). DOI: 10.1023/A:1006524209794
Ljupčo Todorovski, Peter Flach, Nada Lavrač. Predictive Performance of Weighted Relative Accuracy. European Conference on Principles of Data Mining and Knowledge Discovery, pp. 255-264 (2000). DOI: 10.1007/3-540-45372-5_25
Peter Clark, Robin Boswell. Rule induction with CN2: Some recent improvements. Lecture Notes in Computer Science, pp. 151-163 (1991). DOI: 10.1007/BFB0017011