A Data Complexity Analysis of Comparative Advantages of Decision Forest Constructors

Author: Tin Kam Ho

DOI: 10.1007/S100440200009

Abstract: Using a number of measures for characterising the complexity of classification problems, we studied the comparative advantages of two methods for constructing decision forests: bootstrapping and random subspaces. We investigated a collection of 392 two-class problems from the UCI depository, and observed that there are strong correlations between the classifier accuracies and measures of the length of class boundaries, the thickness of the class manifolds, and the nonlinearities of decision boundaries. We found characteristics of both difficult and easy cases where combination methods are no better than single classifiers. Also, we observed that the bootstrapping method is better when the training samples are sparse, and the subspace method is better when the classes are compact and the boundaries are smooth.

References (31)
S. K. Murthy, S. Kasif, S. Salzberg, A system for induction of oblique decision trees, Journal of Artificial Intelligence Research, vol. 2, pp. 1-32 (1994), 10.1613/JAIR.63
Roger Steven Berlind, An alternative method of stochastic discrimination with applications to pattern recognition, State University of New York at Buffalo (1995)
Tin Kam Ho, Random decision forests, International Conference on Document Analysis and Recognition, vol. 1, pp. 278-282 (1995), 10.1109/ICDAR.1995.598994
A. Hoekstra, R.P.W. Duin, On the nonlinearity of pattern classifiers, International Conference on Pattern Recognition, vol. 4, pp. 271-275 (1996), 10.1109/ICPR.1996.547429
G. Toussaint, Bibliography on estimation of misclassification, IEEE Transactions on Information Theory, vol. 20, pp. 472-479 (1974), 10.1109/TIT.1974.1055260
S.J. Raudys, A.K. Jain, Small sample size effects in statistical pattern recognition: recommendations for practitioners, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, pp. 252-264 (1991), 10.1109/34.75512
D.J. Hand, Recent advances in error rate estimation, Pattern Recognition Letters, vol. 4, pp. 335-346 (1986), 10.1016/0167-8655(86)90054-1
J.M. Maciejowski, Model discrimination using an algorithmic information criterion, Automatica, vol. 15, pp. 579-593 (1979), 10.1016/0005-1098(79)90006-2
Jerome H. Friedman, Lawrence C. Rafsky, Multivariate Generalizations of the Wald-Wolfowitz and Smirnov Two-Sample Tests, Annals of Statistics, vol. 7, pp. 697-717 (1979), 10.1214/AOS/1176344722