LODStats --- an extensible framework for high-performance dataset analytics

Authors: Sören Auer, Jan Demter, Michael Martin, Jens Lehmann

DOI: 10.1007/978-3-642-33876-2_31

Keywords: RDF, Analytics, SPARQL, Memory footprint, Scalability, Data Web, Reuse, Data mining, Metadata registry, Computer science

Abstract: One of the major obstacles to wider usage of web data is the difficulty of obtaining a clear picture of the available datasets. In order to reuse, link, revise or query a dataset published on the Web, it is important to know the structure, coverage and coherence of the data. To obtain such information we developed LODStats, a statement-stream-based approach for gathering comprehensive statistics about datasets adhering to the Resource Description Framework (RDF). LODStats is based on the declarative description of statistical dataset characteristics. Its main advantages over other approaches are a smaller memory footprint and significantly better performance and scalability. We integrated LODStats with the CKAN dataset metadata registry and obtained a picture of the current state of a significant part of the Data Web.
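The statement-stream idea in the abstract can be illustrated with a minimal Python sketch. It is not the LODStats implementation or its API: the `Criterion` tuple, the `parse_statements` and `collect_stats` helpers, the two sample criteria and the `example.org` data are all hypothetical, and the parser handles only a simplified one-triple-per-line N-Triples form. The point is that each statement is read once, passed to every declaratively listed criterion, and then discarded, so memory grows with the number of distinct counted keys rather than with the size of the dataset.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, Iterator, List, NamedTuple, Tuple

RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"


class Triple(NamedTuple):
    s: str
    p: str
    o: str


class Criterion(NamedTuple):
    """Hypothetical declarative criterion: a filter plus a key to count."""
    name: str
    matches: Callable[[Triple], bool]
    key: Callable[[Triple], str]


def parse_statements(lines: Iterable[str]) -> Iterator[Triple]:
    """Yield triples one at a time from simplified N-Triples lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.endswith("."):
            line = line[:-1].rstrip()  # drop the statement terminator
        s, p, o = line.split(None, 2)  # the object may contain spaces
        yield Triple(s, p, o)


# Criteria are data, so adding a statistic means adding an entry here.
CRITERIA: List[Criterion] = [
    Criterion("property_usage", lambda t: True, lambda t: t.p),
    Criterion("class_usage", lambda t: t.p == RDF_TYPE, lambda t: t.o),
]


def collect_stats(
    statements: Iterable[Triple], criteria: List[Criterion]
) -> Tuple[int, Dict[str, Counter]]:
    """Update every criterion's counter in a single pass over the stream."""
    stats: Dict[str, Counter] = {c.name: Counter() for c in criteria}
    total = 0
    for t in statements:
        total += 1
        for c in criteria:
            if c.matches(t):
                stats[c.name][c.key(t)] += 1
    return total, stats


SAMPLE = """\
<http://example.org/a> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://example.org/Person> .
<http://example.org/a> <http://example.org/name> "Alice" .
<http://example.org/b> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://example.org/Person> .
"""

total, stats = collect_stats(parse_statements(SAMPLE.splitlines()), CRITERIA)
```

Because `parse_statements` is a generator, the same `collect_stats` call works unchanged on an open file handle, which is what keeps the footprint small for large dumps in this sketch.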
