Identifying synonymy between SNOMED clinical terms of varying length using distributional analysis of electronic health records.

Authors: Martin Duneld, Aron Henriksson, Wendy Webber Chapman, Mike Conway

Abstract: Medical terminologies and ontologies are important tools for natural language processing of health record narratives. To account for the variability of language use, synonyms need to be stored in a semantic resource as textual instantiations of a concept. Developing such resources manually is, however, prohibitively expensive and likely to result in low coverage. To facilitate and expedite the process of lexical resource development, distributional analysis of large corpora provides a powerful data-driven means of (semi-)automatically identifying semantic relations, including synonymy, between terms. In this paper, we demonstrate how distributional analysis of a large corpus of electronic health records, the MIMIC-II database, can be employed to extract synonyms of SNOMED CT preferred terms. A distinctive feature of our method is its ability to identify synonymous relations between terms of varying length.
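
To make the idea of distributional analysis concrete, the following is a minimal sketch and not the authors' implementation. It assumes that multiword terms are known in advance and merged into single tokens so that terms of different lengths can be compared, and it uses plain co-occurrence counts with cosine similarity as a stand-in for whatever distributional model the paper uses. The function names (`join_terms`, `context_vectors`, `nearest`) and the toy sentences are hypothetical.

```python
import math
from collections import Counter, defaultdict


def join_terms(tokens, multiword_terms):
    """Merge known multiword terms into single tokens, longest match first.

    Assumption: the list of multiword terms is supplied by the caller.
    """
    terms = sorted((t.split() for t in multiword_terms), key=len, reverse=True)
    out, i = [], 0
    while i < len(tokens):
        for term in terms:
            if tokens[i:i + len(term)] == term:
                out.append("_".join(term))
                i += len(term)
                break
        else:
            # No multiword term starts here; keep the token unchanged.
            out.append(tokens[i])
            i += 1
    return out


def context_vectors(sentences, window=2):
    """Count, for every token, the tokens seen within `window` positions."""
    vectors = defaultdict(Counter)
    for sent in sentences:
        for i, word in enumerate(sent):
            start, end = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][sent[j]] += 1
    return dict(vectors)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


def nearest(term, vectors, k=3):
    """Rank the other terms by similarity of their contexts to `term`."""
    target = vectors.get(term, Counter())
    scores = [(other, cosine(target, vec)) for other, vec in vectors.items()
              if other != term]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:k]


if __name__ == "__main__":
    # Toy sentences invented for illustration; not taken from MIMIC-II.
    raw = [
        "patient admitted with myocardial infarction and chest pain",
        "patient admitted with heart attack and chest pain",
        "patient denies fever and cough",
    ]
    multiword = ["myocardial infarction", "heart attack", "chest pain"]
    corpus = [join_terms(s.split(), multiword) for s in raw]
    print(nearest("myocardial_infarction", context_vectors(corpus), k=3))
```

In this toy setting, terms that occur in the same contexts receive similar vectors, so a candidate synonym can be ranked highly regardless of how many words it contains. A realistic pipeline would need a far larger corpus, a principled way to detect multiword terms, and an evaluation against SNOMED CT synonyms.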
