Structured learning for spatial information extraction from biomedical text: bacteria biotopes

Authors: Parisa Kordjamshidi, Dan Roth, Marie-Francine Moens

DOI: 10.1186/S12859-015-0542-Z

Keywords: Natural language, Structure (mathematical logic), Biomedical text mining, Spatial analysis, Spatial relation, Information retrieval, Task (project management), Computer science, Web page, Structured prediction

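Abstract: We aim to automatically extract species names of bacteria and their locations from webpages. This task is important for exploiting the vast amount of biological knowledge which is expressed in diverse natural language texts and putting this knowledge in databases for easy access by biologists. The task is challenging and the previous results are far below an acceptable level of performance, particularly for the extraction of localization relationships. Therefore, we design a new system for such extractions, using the framework of structured machine learning techniques. We design a new model for joint extraction of biomedical entities and the localization relationship. Our model is based on a spatial role labeling (SpRL) model designed for spatial understanding of unrestricted text. We extend SpRL to extract discourse-level spatial relations in the biomedical domain and apply it to the BioNLP-ST 2013 BB shared task. We highlight the main differences between general spatial language understanding and spatial information extraction from scientific text, which is the focus of this work. We exploit the text’s structure and discourse-level global features. Our model and the designed features substantially improve on the previous systems, achieving an absolute improvement of approximately 57 percent over the F1 measure of the best previous system for this task. Our experimental results indicate that a joint learning model over all entities and relationships in a document outperforms a model which extracts entities and relationships independently. Our model significantly improves the state-of-the-art results for this task and has high potential to be adopted for other natural language processing (NLP) tasks in the biomedical domain.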
