Investigation and Reduction of Discretization Variance in Decision Tree Induction

Authors: Pierre Geurts, Louis Wehenkel

DOI: 10.1007/3-540-45164-1_17

Keywords: Reduction (complexity); Interpretability; Discretization of continuous features; Stability (learning theory); Decision tree; Mathematics; Statistics; Mathematical optimization; Discretization; Variance reduction; Variance (accounting)

Abstract: This paper focuses on the variance introduced by the discretization techniques used to handle continuous attributes in decision tree induction. Different discretization procedures are first studied empirically; then means to reduce the discretization variance are proposed. The experiments show that this variance is large and that it is possible to reduce it significantly without notable computational costs. The resulting variance reduction mainly improves the interpretability and stability of the trees, and only marginally their accuracy.
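The discretization variance studied in the paper can be illustrated with a minimal sketch (this is an illustrative reconstruction, not the authors' procedure): on bootstrap resamples of the same data, an exhaustive Gini-based search for a single split threshold on a continuous attribute selects noticeably different cut points. The data generator, threshold search, and all names below are assumptions for the illustration.

```python
import random
import statistics

def gini(labels):
    """Gini impurity for a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return 2 * p1 * (1 - p1)

def best_threshold(xs, ys):
    """Exhaustive search for the cut point minimizing weighted Gini
    impurity, the usual discretization step for a continuous attribute."""
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    best_t, best_score = None, float("inf")
    for i in range(1, n):
        if pairs[i][0] == pairs[i - 1][0]:
            continue
        # candidate threshold: midpoint between consecutive distinct values
        t = (pairs[i][0] + pairs[i - 1][0]) / 2
        left = [y for x, y in pairs if x <= t]
        right = [y for x, y in pairs if x > t]
        score = sum(gini(part) * len(part) / n for part in (left, right))
        if score < best_score:
            best_t, best_score = t, score
    return best_t

random.seed(0)
# synthetic data (assumption): noisy class boundary near x = 0.5
xs = [random.random() for _ in range(200)]
ys = [1 if x + random.gauss(0, 0.15) > 0.5 else 0 for x in xs]

# discretization variance: spread of the selected threshold over bootstraps
thresholds = []
for _ in range(30):
    idx = [random.randrange(len(xs)) for _ in range(len(xs))]
    thresholds.append(best_threshold([xs[i] for i in idx], [ys[i] for i in idx]))

print("mean threshold:", round(statistics.mean(thresholds), 3))
print("threshold std: ", round(statistics.stdev(thresholds), 3))
```

The nonzero standard deviation of the selected threshold across resamples is exactly the kind of instability the paper targets: it changes the tree's structure and readability even when predictive accuracy barely moves.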

References (12)
Ron Kohavi, David Wolpert, Bias plus variance decomposition for zero-one loss functions. International Conference on Machine Learning, pp. 275-283, (1996)
Richard A. Olshen, Charles J. Stone, Leo Breiman, Jerome H. Friedman, Classification and Regression Trees, (1983)
Chris Carter, Jason Catlett, Assessing Credit Card Applications Using Machine Learning. IEEE Intelligent Systems, vol. 2, pp. 71-79, (1987), 10.1109/MEX.1987.4307093
Yoav Freund, Robert E. Schapire, A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting. Conference on Learning Theory, vol. 55, pp. 119-139, (1997), 10.1006/JCSS.1997.1504
Friedman, A Recursive Partitioning Decision Rule for Nonparametric Classification. IEEE Transactions on Computers, vol. 26, pp. 404-408, (1977), 10.1109/TC.1977.1674849
Michael I. Jordan, A statistical approach to decision tree modeling. Proceedings of the Seventh Annual Conference on Computational Learning Theory (COLT '94), pp. 13-20, (1994), 10.1145/180139.175372
J. Ross Quinlan, C4.5: Programs for Machine Learning, (1992)
Wray Buntine, Learning classification trees. Statistics and Computing, vol. 2, pp. 63-73, (1992), 10.1007/BF01889584