Authors: Sören Auer, Jan Demter, Michael Martin, Jens Lehmann
DOI: 10.1007/978-3-642-33876-2_31
Keywords: RDF, Analytics, SPARQL, Memory footprint, Scalability, Data Web, Reuse, Data mining, Metadata registry, Computer science
Abstract: One of the major obstacles for a wider usage of web data is the difficulty to obtain a clear picture of the available datasets. In order to reuse, link, revise or query a dataset published on the Web, it is important to know the structure, coverage and coherence of the data. To obtain such information we developed LODStats, a statement-stream-based approach for gathering comprehensive statistics about datasets adhering to the Resource Description Framework (RDF). LODStats is based on a declarative description of statistical characteristics. Its main advantages over other approaches are a smaller memory footprint and significantly better performance and scalability. We integrated LODStats with the CKAN metadata registry and obtained a picture of the current state of a significant part of the Data Web.