Authors: Bahar Sateli, René Witte
DOI: 10.1007/978-3-319-25518-7_11
Keywords:
Abstract: We present an automatic workflow that performs text segmentation and entity extraction from scientific literature, primarily to address Task 2 of the Semantic Publishing Challenge 2015. The goal of Task 2 is to extract various information from full-text papers to represent the context in which a document is written, such as the affiliation of its authors and the corresponding funding bodies. Our proposed solution is composed of two subsystems: (i) a text mining pipeline, developed based on the GATE framework, which extracts structural and semantic entities, such as authors' information and references, and produces semantic (typed) annotations; and (ii) a flexible exporting module, the LODeXporter, which translates the document annotations into RDF triples according to custom mapping rules. Additionally, we leverage existing Named Entity Recognition (NER) tools to extract named entities from the text and ground them to their corresponding resources on the Linked Open Data cloud, thus briefly covering the Task 3 objectives, which involve linking detected entities to existing resources in open datasets. The output of our system is an RDF graph stored in a scalable TDB-based storage with a public SPARQL endpoint for the task's queries.