A shared task involving multi-label classification of clinical free text

Authors: John P. Pestian, Christopher Brew, Paweł Matykiewicz, D. J. Hovermale, Neil Johnson

DOI: 10.3115/1572392.1572411

Abstract: This paper reports on a shared task involving the assignment of ICD-9-CM codes to radiology reports. Two features distinguished this task from previous shared tasks in the biomedical domain. One is that it resulted in the first freely distributable corpus of fully anonymized clinical text. This resource is permanently available and will (we hope) facilitate future research. The other key feature of the task is that it required categorization with respect to a large and commercially significant set of labels. The number of participants was larger than in any previous biomedical challenge task. We describe the data production process and the evaluation measures, and give a preliminary analysis of the results. Many systems performed at levels approaching the inter-coder agreement, suggesting that human-like performance on this task is within the reach of currently available technologies.
