Author: Avraham Adler
DOI:
Keywords:
Abstract: This paper reviews a wide selection of machine learning models built to predict both the presence of diabetes and the presence of undiagnosed diabetes, using eight years of National Health and Nutrition Examination Survey (NHANES) data. Models are tuned and compared via their Brier scores. The most relevant variables of the best-performing models are then compared. A Support Vector Machine with a linear kernel performed best for predicting diabetes, returning a Brier score of 0.0654 and an AUROC of 0.9235 on the test set. An elastic net regression performed best for predicting undiagnosed diabetes, with a Brier score of 0.0294 and an AUROC of 0.9439 on the test set. Similar features appear prominently in both sets of models. Blood osmolality, family history, the prevalence of various compounds, and hypertension are key indicators of all diabetes risk. For undiagnosed diabetes in particular, there are ethnicity or genetic components which arise as strong correlates as well.