Authors: Lionel T. E. Cheng, Jiaping Zheng, Guergana K. Savova, Bradley J. Erickson
DOI: 10.1007/S10278-009-9215-7
Keywords:
Abstract: Information in electronic medical records is often in an unstructured free-text format. This format presents challenges for expedient data retrieval and may fail to convey important findings. Natural language processing (NLP) is an emerging technique for rapid and efficient clinical data retrieval. While proven in disease detection, the utility of NLP in discerning disease progression from free-text reports is untested. We aimed to (1) assess whether unstructured radiology reports contained sufficient information for tumor status classification; (2) develop an NLP-based data extraction tool to determine tumor status from unstructured reports; and (3) compare NLP and human tumor status classification outcomes. Consecutive follow-up brain tumor magnetic resonance imaging reports (2000–2007) from a tertiary center were manually annotated using consensus guidelines on tumor status. Reports were randomized to NLP training (70%) or testing (30%) groups. The NLP tool utilized a support vector machines model with statistical and rule-based outcomes. Most reports had sufficient information for tumor status classification, although 0.8% did not describe status despite reference to prior examinations. Tumor size was unreported in 68.7% of documents, while 50.3% lacked data on change magnitude when there was detectable progression or regression. Using retrospective human classification as the gold standard, NLP achieved 80.6% sensitivity and 91.6% specificity for tumor status determination (mean positive predictive value, 82.4%; negative predictive value, 92.0%). In conclusion, most reports contained sufficient information for tumor status determination, though variable features were used to describe status. NLP demonstrated good accuracy for tumor status determination and may have novel application for automated disease status classification in clinical databases.
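For readers less familiar with the accuracy measures quoted in the abstract, the following is a minimal sketch of how sensitivity, specificity, positive predictive value, and negative predictive value are conventionally computed from confusion-matrix counts for a single class. The function name `diagnostic_metrics` and the example counts are hypothetical and are not taken from the study. The abstract reports mean values, which may reflect averaging over several tumor status classes, and that step is not reproduced here.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return standard diagnostic accuracy measures from confusion-matrix counts.

    tp, fp, tn, fn: true positives, false positives, true negatives and
    false negatives for one class, judged against a gold standard.
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases that were detected
        "specificity": tn / (tn + fp),  # share of non-cases correctly rejected
        "ppv": tp / (tp + fp),          # share of positive calls that were right
        "npv": tn / (tn + fn),          # share of negative calls that were right
    }


# Hypothetical counts for illustration only; NOT data from the study.
metrics = diagnostic_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In a multi-class setting, one common approach is to compute these measures per class in a one-vs-rest fashion and then average them. Whether the authors did so is an assumption, not something the abstract states.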