Missing Data Imputation using Machine Learning Algorithm for Supervised Learning

Authors: R. Vijaya Arjunan, D. Cenitta, Prema K

DOI: 10.1109/ICCCI50826.2021.9402558

Abstract: With a mortality of over 18 million per year, Heart Disease (HD) has emerged as one of the most lethal diseases in the world. Data mining-based heart diagnosis systems can aid cardiac professionals in the timely diagnosis of a patient's condition. In this proposed work, a Python-based data mining system capable of diagnosing HD using a Decision Tree has been developed. In the methodology, a dataset from the UCI repository with 14 attributes was taken into consideration. In the dataset, there are a few missing values (yet found hyperparameter), and pre-processing such data is a common yet challenging problem. A mere substitution will give results biased away from the observed data and will certainly affect the value of the learning process in Machine Learning. Therefore, imputation was done, which gave better accuracy and makes the result more trustworthy.
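The abstract's point about mere substitution can be illustrated with a small sketch. The rows, the column meanings (age, cholesterol), and the nearest-neighbour rule below are all hypothetical choices made for illustration; the abstract does not state which imputation algorithm the authors used, so this is not their method. The sketch only shows that filling every gap with the column mean shrinks the spread of the data, while a simple learned fill that borrows the value from the most similar complete row preserves more of it.

```python
import statistics

# Hypothetical toy rows: (age, cholesterol). None marks a missing value.
rows = [
    (35, 180.0), (42, 195.0), (44, None), (51, 210.0),
    (58, 250.0), (60, None), (66, 300.0), (30, 170.0),
]

def mean_substitute(rows):
    """Mere substitution: fill every gap with the mean of the observed values."""
    present = [c for _, c in rows if c is not None]
    fill = statistics.mean(present)
    return [(a, fill if c is None else c) for a, c in rows]

def nearest_neighbour_impute(rows):
    """Illustrative 1-nearest-neighbour fill: copy the value from the
    complete row whose age is closest (an assumed, simplified rule)."""
    complete = [(a, c) for a, c in rows if c is not None]
    filled = []
    for a, c in rows:
        if c is None:
            _, c = min(complete, key=lambda r: abs(r[0] - a))
        filled.append((a, c))
    return filled

def spread(rows):
    """Population variance of the cholesterol column."""
    return statistics.pvariance([c for _, c in rows])

observed_only = [(a, c) for a, c in rows if c is not None]
mean_filled = mean_substitute(rows)
nn_filled = nearest_neighbour_impute(rows)

print("variance, observed rows only:", spread(observed_only))
print("variance, mean substitution: ", spread(mean_filled))
print("variance, nearest neighbour: ", spread(nn_filled))
```

Mean substitution leaves the column mean unchanged but adds points with zero deviation, so the variance necessarily drops. A classifier such as a decision tree trained on that column sees less variation than the observed data actually contain.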
