Using Natural Language Processing to Improve Efficiency of Manual Chart Abstraction in Research: The Case of Breast Cancer Recurrence

Authors: David S. Carrell, Scott Halgrim, Diem-Thy Tran, Diana S. M. Buist, Jessica Chubak

DOI: 10.1093/AJE/KWT441

Keywords: Health care, Reference standards, Artificial intelligence, Breast cancer recurrence, Cancer recurrence, Chart abstraction, Breast cancer, Progress note, Medicine, Cancer, Natural language processing

Abstract: The increasing availability of electronic health records (EHRs) creates opportunities for automated extraction of information from clinical text. We hypothesized that natural language processing (NLP) could substantially reduce the burden of manual abstraction in studies examining outcomes, like cancer recurrence, that are documented in unstructured clinical text, such as progress notes, radiology reports, and pathology reports. We developed an NLP-based system using open-source software to process clinical notes from 1995 to 2012 for women with early-stage incident breast cancers to identify whether and when recurrences were diagnosed. We evaluated the system using notes from 1,472 patients receiving EHR-documented care in an integrated health care system in the Pacific Northwest. A separate study provided the patient-level reference standard for recurrence status and date. The NLP-based system correctly identified 92% of recurrences and estimated diagnosis dates within 30 days for 88% of these. Specificity was 96%. The system overlooked 5 of 65 recurrences, 4 because electronic documents were unavailable. It also identified other recurrences that had been incorrectly classified as nonrecurrent in the reference standard. If used in similar cohorts, NLP could reduce by 90% the number of EHR charts abstracted to identify confirmed breast cancer recurrence cases at a rate comparable to traditional abstraction.
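
As a quick check on the figures in the abstract, the reported 92% detection rate can be reproduced from the two counts it gives (65 reference-standard recurrences, 5 overlooked). This is only an illustrative sketch: the function and variable names are hypothetical, and it assumes the 92% figure is sensitivity computed against the reference standard as originally recorded.

```python
# Counts taken from the abstract; variable names are illustrative only.
reference_recurrences = 65  # recurrences in the reference standard
missed_by_nlp = 5           # recurrences the NLP-based system overlooked


def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Share of true recurrences that the system detected."""
    return true_positives / (true_positives + false_negatives)


detected = reference_recurrences - missed_by_nlp
sens = sensitivity(detected, missed_by_nlp)
print(f"Detected {detected} of {reference_recurrences}: sensitivity = {sens:.1%}")
# prints "Detected 60 of 65: sensitivity = 92.3%"
```

The abstract does not give the counts behind the 96% specificity, so that figure is not recomputed here.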
