Finding undiagnosed patients with hepatitis C infection: an application of artificial intelligence to patient claims data.

Authors: Orla M. Doyle, Nadejda Leavitt, John A. Rigg

DOI: 10.1038/S41598-020-67013-6

Keywords: Context (language use), Medical history, Medicine, Hepatitis C virus, Public health, Hepatitis C, Random forest, Artificial intelligence, Logistic regression, Retrospective cohort study

Abstract: Hepatitis C virus (HCV) remains a significant public health challenge with approximately half of the infected population untreated and undiagnosed. In this retrospective study, predictive models were developed to identify undiagnosed HCV patients using longitudinal medical claims linked to prescription data from ten million patients in the United States (US) between 2010 and 2016. Features capturing information on demographics, risk factors, symptoms, treatments and procedures relevant to HCV were extracted from patients' medical history. Predictive algorithms were developed based on logistic regression, random forests, gradient boosted trees and a stacked ensemble. Descriptive analysis indicated that patients exhibited known symptoms of HCV on average 2-3 years prior to their diagnosis. The precision was at least 95% for all algorithms at low levels of recall (10%). For recall levels >50%, the stacked ensemble performed best with a precision of 97% compared with 87% for the gradient boosted trees and just 31% for the logistic regression. For context, the Center for Disease Control recommends screening in an at-risk sub-population with an estimated HCV prevalence of 2.23%. The artificial intelligence (AI) algorithm presented here has a precision which is substantially higher than the screening rates associated with recommended clinical guidelines, suggesting that AI algorithms have the potential to provide a step change in the effectiveness of HCV screening.
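
The abstract reports precision at fixed levels of recall and sets it against a screening prevalence of 2.23%. As a rough illustration of how such a figure can be read, the sketch below ranks patients by predicted risk and returns the precision at the first cutoff that reaches a target recall. The function name `precision_at_recall` and the toy labels and scores are hypothetical and are not taken from the study; ties between scores are not handled specially. The final ratio is a back-of-envelope comparison of the two figures quoted in the abstract, not a result reported by the authors.

```python
def precision_at_recall(labels, scores, min_recall):
    """Precision at the highest score cutoff whose recall reaches min_recall.

    labels: 1 for a confirmed case, 0 otherwise.
    scores: predicted risk, higher means more likely to be a case.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    if not 0 < min_recall <= 1:
        raise ValueError("min_recall must be in (0, 1]")

    # Rank patients from highest to lowest predicted risk.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    true_pos = 0
    for n_flagged, (_, label) in enumerate(ranked, start=1):
        true_pos += label
        if true_pos / total_pos >= min_recall:
            # Share of flagged patients who are true cases.
            return true_pos / n_flagged
    raise RuntimeError("unreachable: recall hits 1.0 once all patients are flagged")


# Hypothetical toy data: 10 patients, 4 of them true cases.
toy_labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
toy_scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10]

p_at_50 = precision_at_recall(toy_labels, toy_scores, 0.50)
p_at_75 = precision_at_recall(toy_labels, toy_scores, 0.75)

# Ratio of the quoted ensemble precision (97%) to the quoted
# prevalence in the screened at-risk sub-population (2.23%).
ratio_vs_screening = 0.97 / 0.0223
```

On the toy data, flagging the top two patients already captures half of the cases, while reaching 75% recall requires flagging four patients, one of whom is not a case. This is the usual trade-off behind a "precision at recall" figure.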

References (36)
David H. Wolpert, Original Contribution: Stacked generalization. Neural Networks, vol. 5, pp. 241-259 (1992), 10.1016/S0893-6080(05)80023-1
Bryce D Smith, Rebecca L Morgan, Geoff A Beckett, Yngve Falck-Ytter, Deborah Holtzman, Chong-Gee Teo, Amy Jewett, Brittney Baack, David B Rein, Nita Patel, Miriam Alter, Anthony Yartel, John W Ward, Centers for Disease Control and Prevention, Recommendations for the identification of chronic hepatitis C virus infection among persons born during 1945-1965. MMWR. Recommendations and reports: Morbidity and mortality weekly report. Recommendations and reports / Centers for Disease Control, vol. 61, pp. 1-32 (2012)
Brian R. Edlin, Benjamin J. Eckhardt, Marla A. Shu, Scott D. Holmberg, Tracy Swan, Toward a More Accurate Estimate of the Prevalence of Hepatitis C in the United States. Hepatology, vol. 62, pp. 1353-1363 (2015), 10.1002/HEP.27978
Jerome H. Friedman, Greedy function approximation: A gradient boosting machine. Annals of Statistics, vol. 29, pp. 1189-1232 (2001), 10.1214/AOS/1013203451
Baligh R. Yehia, Asher J. Schranz, Craig A. Umscheid, Vincent Lo Re, The Treatment Cascade for Chronic Hepatitis C Virus Infection in the United States: A Systematic Review and Meta-Analysis. PLoS ONE, vol. 9, e101554 (2014), 10.1371/JOURNAL.PONE.0101554
R. L. Koretz, K. W. Lin, J. P. A. Ioannidis, J. Lenzer, Is widespread screening for hepatitis C justified? BMJ, vol. 350, pp. 126-129 (2015), 10.1136/BMJ.G7809
Ron Kohavi, George H. John, Wrappers for feature subset selection. Artificial Intelligence, vol. 97, pp. 273-324 (1997), 10.1016/S0004-3702(97)00043-X
Jonathan Moorman, Mustafa Saad, Semaan Kosseifi, Guha Krishnaswamy, Hepatitis C virus and the lung: implications for therapy. Chest, vol. 128, pp. 2882-2892 (2005), 10.1378/CHEST.128.4.2882
Xu-ying Liu, Jianxin Wu, Zhi-hua Zhou, Exploratory Under-Sampling for Class-Imbalance Learning. International Conference on Data Mining, pp. 965-969 (2006), 10.1109/ICDM.2006.68