A novel application of Hoeffding's inequality to decision trees construction for data streams

Authors: Piotr Duda, Maciej Jaworski, Lena Pietruczuk, Leszek Rutkowski

DOI: 10.1109/IJCNN.2014.6889806

Keywords: Data stream mining, Hoeffding's inequality, Critical point (mathematics), Data stream, Mathematics, Approximation algorithm, Decision tree, Mathematical optimization

Abstract: Decision trees are commonly applied tools in the task of data stream classification. The most critical point in a decision tree construction algorithm is the choice of the splitting attribute. In the majority of algorithms existing in the literature, the splitting criterion is based on statistical bounds derived for split measure functions. In this paper we propose a totally new kind of splitting criterion. We derive statistical bounds for the arguments of the split measure function instead of deriving them for the split measure function itself. This approach allows us to properly use Hoeffding's inequality to obtain the required bounds. Based on these theoretical results, we propose the Decision Trees based on the Fractions Approximation algorithm (DTFA). The algorithm exhibits satisfactory classification accuracy in numerical experiments. It is also compared with other methods existing in the literature, demonstrating noticeably better performance.
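
The idea of bounding the arguments of a split measure, rather than the measure itself, can be illustrated with a minimal sketch. It rests on one standard fact: a class fraction observed in a tree node is the mean of independent 0/1 indicators, so Hoeffding's inequality applies to it directly, whereas split measures such as information gain or the Gini index are nonlinear functions of those fractions. The function names `hoeffding_epsilon` and `fraction_interval` and the example counts are assumptions made for illustration. This is not the DTFA algorithm, whose details the abstract does not give.

```python
import math
from typing import Tuple


def hoeffding_epsilon(n: int, delta: float) -> float:
    """Two-sided Hoeffding half-width for the mean of n i.i.d. values in [0, 1].

    With probability at least 1 - delta, the sample mean differs from its
    expectation by at most the returned value.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))


def fraction_interval(count: int, n: int, delta: float) -> Tuple[float, float]:
    """Confidence interval for a class fraction estimated as count / n."""
    p_hat = count / n
    eps = hoeffding_epsilon(n, delta)
    # Clip to [0, 1], since a fraction cannot leave that range.
    return max(0.0, p_hat - eps), min(1.0, p_hat + eps)


# Hypothetical node: 1000 stream elements seen, 620 of them in class A.
low, high = fraction_interval(count=620, n=1000, delta=0.05)
print(f"class-A fraction lies in [{low:.3f}, {high:.3f}] with prob. >= 0.95")
```

The half-width shrinks proportionally to 1/sqrt(n), so the interval tightens as more stream elements reach the node. How such bounds on the fractions are turned into a splitting decision is the subject of the paper and is not reproduced here.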

References (25)
Pawel Matuszyk, Georg Krempl, Myra Spiliopoulou. Correcting the Usage of the Hoeffding Inequality in Stream Mining. Advances in Intelligent Data Analysis XII, pp. 298-309 (2013). DOI: 10.1007/978-3-642-41398-8_26
Bernhard Pfahringer, Geoffrey Holmes, Richard Kirkby. New options for hoeffding trees. Australasian Joint Conference on Artificial Intelligence, pp. 90-99 (2007). DOI: 10.1007/978-3-540-76928-6_11
João Gama, Ricardo Fernandes, Ricardo Rocha. Decision trees for mining data streams. Intelligent Data Analysis, vol. 10, pp. 23-45 (2006). DOI: 10.3233/IDA-2006-10103
Bartosz A. Nowak, Robert K. Nowicki, Wojciech K. Mleczko. A New Method of Improving Classification Accuracy of Decision Tree in Case of Incomplete Samples. International Conference on Artificial Intelligence and Soft Computing, vol. 7894, pp. 448-458 (2013). DOI: 10.1007/978-3-642-38658-9_40
Indre Zliobaite, Albert Bifet, Bernhard Pfahringer, Geoffrey Holmes. Active Learning With Drifting Streaming Data. IEEE Transactions on Neural Networks, vol. 25, pp. 27-39 (2014). DOI: 10.1109/TNNLS.2012.2236570
R. P. Jagadeesh Chandra Bose, Wil M. P. van der Aalst, Indre Zliobaite, Mykola Pechenizkiy. Dealing With Concept Drifts in Process Mining. IEEE Transactions on Neural Networks, vol. 25, pp. 154-171 (2014). DOI: 10.1109/TNNLS.2013.2278313
Ludmila I. Kuncheva, William J. Faithfull. PCA Feature Extraction for Change Detection in Multidimensional Unlabeled Data. IEEE Transactions on Neural Networks, vol. 25, pp. 69-80 (2014). DOI: 10.1109/TNNLS.2013.2248094
Dariusz Brzezinski, Jerzy Stefanowski. Reacting to Different Types of Concept Drift: The Accuracy Updated Ensemble Algorithm. IEEE Transactions on Neural Networks, vol. 25, pp. 81-94 (2014). DOI: 10.1109/TNNLS.2013.2251352
Jing Liu, Xue Li, Weicai Zhong. Ambiguous decision trees for mining concept-drifting data streams. Pattern Recognition Letters, vol. 30, pp. 1347-1355 (2009). DOI: 10.1016/J.PATREC.2009.07.017