An improved Id3 algorithm for medical data classification

Authors: Shuo Yang, Jing-Zhi Guo, Jun-Wei Jin

DOI: 10.1016/J.COMPELECENG.2017.08.005

Keywords: Artificial intelligence, Word error rate, Computer science, Data classification, C4.5 algorithm, Classifier (UML), Decision stump, ID3 algorithm, Statistical classification, Random tree, Machine learning, Data mining

Abstract: Data mining techniques play an important role in clinical decision making, providing physicians with accurate, reliable and quick predictions through building different models. This paper presents an improved classification approach for the prediction of diseases, based on the classical Iterative Dichotomiser 3 (Id3) algorithm. The improved Id3 algorithm overcomes the multi-value bias problem when selecting test/split attributes, solves the issue of numeric attribute discretization, and stores the classifier model in the form of rules by using a heuristic strategy, for easy understanding and memory savings. Experiment results show that the improved algorithm is superior to four current algorithms (J48, Decision Stump, Random Tree and classical Id3) in terms of accuracy, stability and minor error rate.
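
The multi-value bias mentioned in the abstract can be illustrated with a small sketch. It is not the paper's heuristic, which the abstract does not specify. It only shows how the information gain used by classical Id3 favours an attribute with many distinct values, and how the gain ratio used by C4.5 (one well-known correction, swapped in here for illustration) reverses that choice. The toy data and the attribute names `patient_id` and `fever` are invented for the example.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total) for n in Counter(items).values())


def information_gain(values, labels):
    """Classical Id3 criterion: label entropy minus weighted entropy after the split."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(values, labels):
    """C4.5-style correction: information gain divided by the entropy of the split itself."""
    split_info = entropy(values)
    if split_info == 0:
        return 0.0
    return information_gain(values, labels) / split_info


# Hypothetical toy data: six patients, one label each.
labels = ["sick", "sick", "healthy", "healthy", "sick", "healthy"]
patient_id = [1, 2, 3, 4, 5, 6]                  # unique per row, no predictive value
fever = ["yes", "yes", "no", "no", "yes", "yes"]  # two values, partially informative

for name, column in [("patient_id", patient_id), ("fever", fever)]:
    print(name, round(information_gain(column, labels), 3), round(gain_ratio(column, labels), 3))
```

On this toy data, information gain ranks `patient_id` above `fever` because every one-row partition is trivially pure, while gain ratio ranks `fever` higher because it penalises splits with many branches.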
