Tree-based classifier ensembles for early detection method of diabetes: an exploratory study

Authors: Bayu Adhi Tama, Kyung-Hyune Rhee

DOI: 10.1007/S10462-017-9565-3

Keywords: Decision tree, Artificial intelligence, Machine learning, Logistic model tree, Pattern recognition, Ensemble learning, Boosting (machine learning), Computer science, Random tree, Naive Bayes classifier, Classifier (UML), Incremental decision tree

Abstract: Diabetes is a lifestyle-driven disease which has become a critical health issue worldwide. In this paper, we conduct an exploratory study of early detection methods for diabetes mellitus using various ensemble learning techniques. Eight tree-based machine learning algorithms, i.e. classification and regression tree, decision tree (C4.5), reduced error pruning tree, random tree, naive Bayes tree, functional tree, best-first decision tree, and logistic model tree, are employed as base classifiers in five different ensembles: bagging, boosting, random subspace, DECORATE, and rotation forest. The performance of the ensemble classifiers is thoroughly benchmarked on three real-world datasets in terms of the area under the receiver operating characteristic curve (AUC) metric. Finally, we assess the performance differences among the classifiers using several statistical significance tests. We contribute to the existing literature with an extensive benchmark of tree-based classifier ensembles for early detection of the disease.
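As a rough illustration of the evaluation described in the abstract, the sketch below bags one-level decision trees (stumps) and scores the resulting ensemble with the area under the ROC curve. It is a minimal sketch under stated assumptions, not the authors' setup: the function names (`auc`, `fit_stump`, `bagged_scores`), the single-feature toy data, and the use of stumps as base learners are all hypothetical stand-ins for the tree learners and datasets used in the paper.

```python
import random


def auc(labels, scores):
    """Rank-based AUC: the probability that a randomly chosen positive
    example receives a higher score than a randomly chosen negative one
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def fit_stump(xs, ys):
    """Fit a one-level tree on a single feature: choose the threshold t
    that minimises training errors for the rule 'predict 1 if x > t'."""
    best_t, best_err = None, None
    for t in sorted(set(xs)):
        err = sum((x > t) != (y == 1) for x, y in zip(xs, ys))
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t


def bagged_scores(train_x, train_y, test_x, n_models=25, seed=0):
    """Bagging: train each stump on a bootstrap resample of the training
    set, then score a test point by the fraction of stumps voting 1."""
    rng = random.Random(seed)
    n = len(train_x)
    thresholds = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        thresholds.append(fit_stump([train_x[i] for i in idx],
                                    [train_y[i] for i in idx]))
    return [sum(x > t for t in thresholds) / n_models for x in test_x]


if __name__ == "__main__":
    # Made-up toy values (hypothetical, not taken from any real dataset).
    train_x = [1, 2, 3, 4, 5, 6, 7, 8]
    train_y = [0, 0, 0, 1, 0, 1, 1, 1]
    test_x = [1.5, 3.5, 4.5, 6.5]
    test_y = [0, 0, 1, 1]
    print("ensemble AUC:", auc(test_y, bagged_scores(train_x, train_y, test_x)))
```

Averaging votes over bootstrap resamples yields a graded score rather than a hard label, which is what makes a ranking metric such as AUC applicable to the ensemble.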

References (35)
Rahman Ali, Muhammad Hameed Siddiqi, Muhammad Idris, Byeong Ho Kang, Sungyoung Lee. Prediction of diabetes mellitus based on boosting ensemble modeling. ubiquitous computing, vol. 8867, pp. 25-28 (2014). 10.1007/978-3-319-13102-3_6
Ron Kohavi. Scaling up the accuracy of Naive-Bayes classifiers: a decision-tree hybrid. knowledge discovery and data mining, pp. 202-207 (1996)
Janez Demšar. Statistical Comparisons of Classifiers over Multiple Data Sets. Journal of Machine Learning Research, vol. 7, pp. 1-30 (2006)
Richard A Olshen, Charles J Stone, Leo Breiman, Jerome H Friedman. Classification and regression trees (1983)
Niels Landwehr, Mark Hall, Eibe Frank. Logistic Model Trees. Machine Learning, vol. 59, pp. 161-205 (2005). 10.1007/S10994-005-0466-3
Dursun Delen, Glenn Walker, Amit Kadam. Predicting breast cancer survivability: a comparison of three data mining methods. Artificial Intelligence in Medicine, vol. 34, pp. 113-127 (2005). 10.1016/J.ARTMED.2004.07.002
Yoav Freund, Robert E Schapire. A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting. Journal of Computer and System Sciences, vol. 55, pp. 119-139 (1997). 10.1006/JCSS.1997.1504
Mgs Afriyan Firdaus, Rin Nadia, Bayu Adhi Tama. Detecting major disease in public hospital using ensemble techniques. 2014 International Symposium on Technology Management and Emerging Technologies, pp. 149-152 (2014). 10.1109/ISTMET.2014.6936496
GianLuca Marcialis, Fabio Roli. Fusion of appearance-based face recognition algorithms. Pattern Analysis and Applications, vol. 7, pp. 151-163 (2004). 10.1007/S10044-004-0212-7
Olive Jean Dunn. Multiple Comparisons Using Rank Sums. Technometrics, vol. 6, pp. 241-252 (1964). 10.1080/00401706.1964.10490181