Scale-Out Processing of Large RDF Datasets

DOI: 10.1109/TBDATA.2015.2505719

Keywords: Big data, Data mining, Computer science, RDF, Dynamic data, Distributed database, SPARQL, Cwm, RDF query language, Scalability

Abstract: Distributed RDF data management systems become increasingly important with the growth of the Semantic Web. Regardless, current methods meet performance bottlenecks either on data loading or querying when processing large amounts of data. In this work, we propose efficient methods for processing RDF using dynamic data re-partitioning to enable rapid analysis of large datasets. Our approach adopts a two-tier index architecture on each computation node: (1) a lightweight primary index, to keep loading times low, and (2) a series of dynamic, multi-level secondary indexes, calculated as a by-product of query execution, to decrease or remove inter-machine data movement for subsequent queries that contain the same graph patterns. In addition, we propose to replace some secondary indexes with distributed filters, so as to decrease memory consumption. Experimental results on a commodity cluster with 16 nodes show that the method presents good scale-out characteristics and can indeed vastly improve loading speeds while remaining competitive in terms of query performance. Specifically, our approach can load a dataset of 1.1 billion triples at a rate of 2.48 million triples per second and provide performance competitive with RDF-3X and 4store for expensive queries.
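The two-tier idea in the abstract can be illustrated with a toy, single-process sketch. This is not the authors' implementation: the `Cluster` class, the `owner` hash function, the choice of subject-hash placement for the primary index, and the two-pattern join are all assumptions made for illustration. The sketch only shows the general mechanism in which a re-partitioning performed for one query is kept as a secondary index, so a later query with the same graph pattern moves no data between nodes.

```python
import zlib
from collections import defaultdict


def owner(term, n):
    # Deterministic hash placement of a term on one of n nodes (assumed scheme).
    return zlib.crc32(term.encode("utf-8")) % n


class Cluster:
    """Toy model of n nodes, each with a primary and a secondary index."""

    def __init__(self, n):
        self.n = n
        # Primary index per node: predicate -> list of (subject, object).
        self.primary = [defaultdict(list) for _ in range(n)]
        # Secondary indexes per node: (predicate, position) -> {key: rows}.
        self.secondary = [dict() for _ in range(n)]
        self.moved = 0  # number of triples sent to a different node

    def load(self, triples):
        # Cheap load: one hash and one append per triple, no sorting.
        for s, p, o in triples:
            self.primary[owner(s, self.n)][p].append((s, o))

    def partition_by(self, pred, pos):
        """Return, per node, the rows of `pred` grouped by subject or object."""
        key = (pred, pos)
        if all(key in sec for sec in self.secondary):
            # Reuse the secondary index built by an earlier query.
            return [sec[key] for sec in self.secondary]
        parts = [defaultdict(list) for _ in range(self.n)]
        for src in range(self.n):
            for s, o in self.primary[src].get(pred, []):
                k = s if pos == "s" else o
                dst = owner(k, self.n)
                if dst != src:
                    self.moved += 1  # simulated inter-machine transfer
                parts[dst][k].append((s, o))
        # Keep the re-partitioned data as a by-product of this query.
        for node, part in enumerate(parts):
            self.secondary[node][key] = part
        return parts

    def join(self, pred1, pred2):
        """Evaluate { ?a pred1 ?b . ?b pred2 ?c } with node-local joins on ?b."""
        left = self.partition_by(pred1, "o")
        right = self.partition_by(pred2, "s")
        out = []
        for node in range(self.n):
            for b, lrows in left[node].items():
                for a, _ in lrows:
                    for _, c in right[node].get(b, []):
                        out.append((a, b, c))
        return out
```

In this sketch, calling `join` twice with the same predicates leaves `moved` unchanged on the second call, because both partitions are served from the secondary indexes. The distributed filters mentioned in the abstract are not modelled here.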
