An Automatic Workflow for the Formalization of Scholarly Articles’ Structural and Semantic Elements

Authors: Bahar Sateli, René Witte

DOI: 10.1007/978-3-319-46565-4_24

Keywords:

Abstract: We present a workflow for the automatic transformation of scholarly literature to a Linked Open Data (LOD) compliant knowledge base, to address Task 2 of the Semantic Publishing Challenge 2016. In this year’s task, we aim to extract various contextual information from full-text papers using a text mining pipeline that integrates LOD-based Named Entity Recognition (NER) and triplification of the detected entities. In our proposed approach, we leverage an existing NER tool to ground named entities, such as geographical locations, to their LOD resources. Combined with a rule-based approach, we demonstrate how we can extract both the structural (e.g., floats and sections) and semantic elements (e.g., authors and their respective affiliations) of the provided dataset’s documents. Finally, we integrate LODeXporter, our flexible exporting module, to represent the results as semantic triples in RDF format. As a result, we generate a scalable, TDB-based knowledge base that is interlinked with the LOD cloud, and a public SPARQL endpoint for the task’s queries. Our submission won second place at the SemPub2016 challenge with an average F-score of 0.63.
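The triplification step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' LODeXporter module, whose interface is not described here. The `Entity` class, the `triplify` function, the `http://example.org/` namespace, and the choice of `rdfs:seeAlso` for linking to a LOD resource are all assumptions made for illustration. The sketch only shows how a detected entity, optionally grounded to a LOD URI, might be serialized as N-Triples lines.

```python
from dataclasses import dataclass
from typing import List, Optional

# Standard RDF/RDFS predicate URIs.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"

# Hypothetical base namespace for minted URIs (an assumption, not from the paper).
BASE = "http://example.org/"


@dataclass
class Entity:
    """A detected entity; all field names are hypothetical."""
    doc_id: str                    # identifier of the source document
    index: int                     # position of the entity within the document
    kind: str                      # e.g. "Location", "Author", "Affiliation"
    text: str                      # surface form found in the text
    lod_uri: Optional[str] = None  # LOD resource the NER tool grounded it to, if any


def escape_literal(value: str) -> str:
    """Escape backslashes, quotes, and line breaks for an N-Triples literal."""
    return (value.replace("\\", "\\\\").replace('"', '\\"')
            .replace("\n", "\\n").replace("\r", "\\r"))


def triplify(entity: Entity) -> List[str]:
    """Return N-Triples lines describing one detected entity."""
    subject = f"<{BASE}{entity.doc_id}/entity/{entity.index}>"
    triples = [
        f"{subject} <{RDF_TYPE}> <{BASE}vocab/{entity.kind}> .",
        f'{subject} <{RDFS_LABEL}> "{escape_literal(entity.text)}" .',
    ]
    # Only grounded entities get a link into the LOD cloud.
    if entity.lod_uri is not None:
        triples.append(f"{subject} <{RDFS_SEE_ALSO}> <{entity.lod_uri}> .")
    return triples


if __name__ == "__main__":
    montreal = Entity("paper1", 0, "Location", "Montreal",
                      "http://dbpedia.org/resource/Montreal")
    print("\n".join(triplify(montreal)))
```

Lines produced this way could be loaded into any triple store that accepts N-Triples and then queried with SPARQL. The vocabulary terms above are placeholders and would be replaced by whatever the task's queries expect.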

References (7)
Kalina Bontcheva, Hamish Cunningham, Valentin Tablan, Diana Maynard. A framework and graphical development environment for robust NLP tools and applications. Meeting of the Association for Computational Linguistics, pp. 168-175 (2002).
Georgi Kobilarov, Jens Lehmann, Richard Cyganiak, Zachary Ives, Sören Auer, Christian Bizer. DBpedia: a nucleus for a web of open data. International Semantic Web Conference, vol. 4825, pp. 722-735 (2007). DOI: 10.1007/978-3-540-76298-0_52
Bahar Sateli, René Witte. Automatic Construction of a Semantic Knowledge Base from CEUR Workshop Proceedings. Extended Semantic Web Conference, vol. 548, pp. 129-141 (2015). DOI: 10.1007/978-3-319-25518-7_11
Alexandru Constantin, Steve Pettifer, Andrei Voronkov. PDFX: fully-automated PDF-to-XML conversion of scientific literature. Document Engineering, pp. 177-180 (2013). DOI: 10.1145/2494266.2494271
Pablo N. Mendes, Max Jakob, Andrés García-Silva, Christian Bizer. DBpedia spotlight. Proceedings of the 7th International Conference on Semantic Systems - I-Semantics '11, pp. 1-8 (2011). DOI: 10.1145/2063518.2063519
Silvio Peroni, David Shotton, Fabio Vitali. Faceted documents. Proceedings of the 2012 ACM symposium on Document engineering - DocEng '12, pp. 191-194 (2012). DOI: 10.1145/2361354.2361396