Authors: Benoît Habert, Adeline Nazarenko, Pierre Zweigenbaum, J. Bouaud
DOI:
Keywords:
Abstract: There is a constant need to extend and tune specialized vocabularies to account for new words and new word usages. This paper addresses the issue of characterizing the semantic class of such words. We test the hypothesis that the analysis of the distribution of words in a representative corpus, as obtained by robust NLP tools, can help to identify words with similar meanings, and to decide on the most likely semantic category of a given word based on the categories of its neighbors. We report on an experiment on a moderate-size corpus of patient discharge summaries collected during the MENELAS project, taking as semantic categories the high-level axes of the SNOMED nomenclature, and processing the corpus with the ZELLIG suite of tools. We attempt to quantify the extent to which this process succeeds in proposing correct semantic categories while we vary several parameters of the method. The percentage of correctly categorized words (precision) ranges between 50 and 75 %, with the best percentage of categorized words (recall) at 37 % for the whole categorization process. Categorization results are significantly above chance, but not sufficient for fully-automated categorization. We discuss possible uses and further directions for improvement.
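The idea of assigning a word the most likely category of its neighbors can be illustrated with a minimal Python sketch. It is not the ZELLIG pipeline or the authors' actual procedure: the majority vote, the abstention on ties, the function names `categorize` and `evaluate`, the toy words, and the single-letter category labels are all assumptions made for illustration. `coverage` here is the share of words that receive any category, which is one reading of what the abstract calls recall.

```python
from collections import Counter

def categorize(word, neighbors, known_categories):
    """Propose the most frequent category among a word's categorized neighbors.

    Returns None (abstains) when no neighbor has a known category or when
    the two top categories are tied. Both rules are assumptions of this sketch.
    """
    votes = Counter(
        known_categories[n]
        for n in neighbors.get(word, [])
        if n in known_categories
    )
    if not votes:
        return None
    ranked = votes.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # tie between the two best categories: abstain
    return ranked[0][0]

def evaluate(gold, neighbors, known_categories):
    """Return (precision, coverage) over the words listed in `gold`."""
    proposed = {w: categorize(w, neighbors, known_categories) for w in gold}
    categorized = {w: c for w, c in proposed.items() if c is not None}
    correct = sum(1 for w, c in categorized.items() if c == gold[w])
    precision = correct / len(categorized) if categorized else 0.0
    coverage = len(categorized) / len(gold) if gold else 0.0
    return precision, coverage

# Hypothetical toy data: T, D, P are placeholder category labels,
# not actual SNOMED axis codes.
known_categories = {
    "artery": "T", "vein": "T",
    "stenosis": "D", "infarction": "D",
    "angioplasty": "P",
}
# Hypothetical distributional neighbors of the words to be categorized.
neighbors = {
    "vessel": ["artery", "vein", "stenosis"],
    "occlusion": ["stenosis", "infarction", "artery"],
    "lesion": ["artery", "stenosis"],
    "bypass": ["vein", "artery"],
}
# Hypothetical reference categories used only for scoring.
gold = {"vessel": "T", "occlusion": "D", "lesion": "D", "bypass": "P"}

precision, coverage = evaluate(gold, neighbors, known_categories)
print(f"precision={precision:.2f} coverage={coverage:.2f}")
```

On this toy data three of the four words receive a category and two of those three are correct, which shows how precision and coverage can diverge: abstaining on uncertain words raises precision at the cost of categorizing fewer words.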