Speeding up AdaBoost Classifier with Random Projection

Authors: Biswajit Paul, G. Athithan, M. Narasimha Murty

DOI: 10.1109/ICAPR.2009.67

Abstract: The development of techniques for scaling up classifiers so that they can be applied to problems with large datasets of training examples is one of the objectives of data mining. Recently, AdaBoost has become popular in the machine learning community thanks to its promising results across a variety of applications. However, training AdaBoost on large datasets is a major problem, especially when the dimensionality of the data is very high. This paper discusses the effect of high dimensionality on the training process of AdaBoost. Two preprocessing options to reduce dimensionality, namely principal component analysis and random projection, are briefly examined. Random projection, subject to a probabilistic length-preserving transformation, is explored further as a computationally light preprocessing step. The experimental results obtained demonstrate the effectiveness of the proposed training process for handling high-dimensional large datasets.
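The length-preserving property mentioned in the abstract can be illustrated with a minimal sketch. It assumes a dense Gaussian projection matrix with entries drawn from N(0, 1/k), which is one common construction; the abstract does not say which construction the authors use, and the function names, dimensions, and seeds below are hypothetical choices made only for illustration.

```python
import math
import random


def random_projection_matrix(d, k, seed=0):
    """Build a k x d matrix with i.i.d. N(0, 1/k) entries (assumed construction)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]


def project(matrix, x):
    """Map a d-dimensional vector x to k dimensions by a matrix-vector product."""
    return [sum(r * v for r, v in zip(row, x)) for row in matrix]


def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(a * a for a in v))


# Hypothetical sizes: reduce 2000 features to 200.
d, k = 2000, 200
R = random_projection_matrix(d, k, seed=42)

# A synthetic high-dimensional example standing in for one training vector.
data_rng = random.Random(1)
x = [data_rng.uniform(-1.0, 1.0) for _ in range(d)]

x_low = project(R, x)
ratio = norm(x_low) / norm(x)  # close to 1 with high probability
print(f"projected dimension: {len(x_low)}, length ratio: {ratio:.3f}")
```

In a pipeline of the kind the abstract describes, every training vector would be projected once with the same matrix, and the boosted classifier would then be trained on the k-dimensional vectors. Unlike principal component analysis, the matrix is drawn without looking at the data, which is what keeps this step computationally light.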
