Using Machine Learning Techniques to Identify Key Risk Factors for Diabetes and Undiagnosed Diabetes.

Author: Avraham Adler

Abstract: This paper reviews a wide selection of machine learning models built to predict both the presence of diabetes and the presence of undiagnosed diabetes, using eight years of National Health and Nutrition Examination Survey (NHANES) data. Models are tuned and compared via their Brier scores. The most relevant variables of the best performing models are then compared. A Support Vector Machine with a linear kernel performed best for predicting diabetes, returning a Brier score of 0.0654 and an AUROC of 0.9235 on the test set. An elastic net regression performed best for predicting undiagnosed diabetes, with a Brier score of 0.0294 and an AUROC of 0.9439 on the test set. Similar features appear prominently in both sets of models. Blood osmolality, family history, the prevalence of various compounds, and hypertension are key indicators for all diabetes risk. For undiagnosed diabetes in particular, there are ethnicity or genetic components which arise as strong correlates as well.
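
The two metrics named in the abstract, the Brier score and the AUROC, can be illustrated with a minimal sketch. This is not the paper's code: the function names `brier_score` and `auroc` and the four-observation toy data are assumptions made here for illustration only, and the AUROC is computed by the pairwise (Mann-Whitney) definition rather than by whatever routine the author used.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and 0/1 outcome.

    Lower is better; 0 is a perfect forecast.
    """
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auroc(y_true, y_prob):
    """Area under the ROC curve via the pairwise definition.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count as 1/2).
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from NHANES): 1 = has diabetes, 0 = does not.
labels = [0, 0, 1, 1]
probs = [0.1, 0.4, 0.35, 0.8]

print(brier_score(labels, probs))
print(auroc(labels, probs))
```

The Brier score rewards calibrated probabilities, while the AUROC only measures how well cases are ranked, which is one reason a study might report both.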
