Optimizing RDF(S) queries on cloud platforms

Authors: HyeongSik Kim, Padmashree Ravindra, Kemafor Anyanwu

DOI: 10.1145/2487788.2487917

Keywords: Data modeling, Cost effectiveness, Computer science, SPARQL, RDF, Cloud computing, Database, Analytics, Data model, Semantic Web, Workflow

Abstract: Scalable processing of Semantic Web queries has become a critical need given the rapid upward trend in the availability of Semantic Web data. The MapReduce paradigm is emerging as a platform of choice for large-scale data processing and analytics due to its ease of use, cost effectiveness, and potential for unlimited scaling. Processing queries on Semantic Web triple models is a challenge on the mainstream MapReduce platform called Apache Hadoop, and on its extensions such as Pig and Hive. This is because such queries require numerous joins, which lead to lengthy and expensive MapReduce workflows. Further, in this paradigm, cloud resources are acquired on demand, and the traditional join optimization machinery such as statistics and indexes is often absent or not easily supported. In this demonstration, we will present RAPID+, an extended Apache Pig system that uses an algebraic approach for optimizing queries on RDF data models, including queries involving inferencing. The basic idea is that by using logical and physical operators that are more natural to MapReduce processing, we can reinterpret such queries in a way that leads to more concise execution workflows and small intermediate data footprints that minimize disk I/Os and network transfer overhead. RAPID+ evaluates queries using the Nested TripleGroup Data Model and Algebra (NTGA). The demo will show the comparative performance of NTGA query plans vs. the relational algebra-like query plans used by Apache Pig and Hive.
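The contrast the abstract draws between join-heavy relational plans and more concise grouped plans can be illustrated with a small sketch. It assumes, in the spirit of a triplegroup model, that triples sharing a subject are collected into one group, so that a star-shaped pattern is answered with a single grouping pass instead of one join per extra property. The function names (`star_join_relational`, `star_join_triplegroups`) and the toy data are hypothetical; this is plain in-memory Python, not RAPID+, Pig, or Hadoop code.

```python
from collections import defaultdict

# Toy (subject, property, object) triples; purely illustrative data.
TRIPLES = [
    ("p1", "rdf:type", "Product"),
    ("p1", "label", "Laptop"),
    ("p1", "price", "900"),
    ("p2", "rdf:type", "Product"),
    ("p2", "label", "Phone"),
    ("p3", "label", "Cable"),
    ("p3", "price", "5"),
]


def star_join_relational(triples, properties):
    """Relational-style plan: one self-join on subject per extra property."""
    rows = [{"s": s, properties[0]: o} for s, p, o in triples if p == properties[0]]
    joins = 0
    for prop in properties[1:]:
        matches = [(s, o) for s, p, o in triples if p == prop]
        # Nested-loop join on the shared subject.
        rows = [{**row, prop: o} for row in rows for s, o in matches if s == row["s"]]
        joins += 1
    return rows, joins


def star_join_triplegroups(triples, properties):
    """Grouped plan: one group-by-subject pass, then a filter on each group."""
    required = set(properties)
    groups = defaultdict(list)
    for s, p, o in triples:
        if p in required:
            groups[s].append((p, o))
    # Keep only groups that contain every required property.
    return {
        s: sorted(pairs)
        for s, pairs in groups.items()
        if required <= {p for p, _ in pairs}
    }


if __name__ == "__main__":
    pattern = ["rdf:type", "label", "price"]
    print(star_join_relational(TRIPLES, pattern))
    print(star_join_triplegroups(TRIPLES, pattern))
```

In this sketch the relational variant performs one join per additional property in the pattern, while the grouped variant touches the data once. On a MapReduce platform each join would typically correspond to an extra shuffle step, which is the kind of workflow length the abstract is concerned with.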

References (5)
Padmashree Ravindra, HyeongSik Kim, Kemafor Anyanwu. An Intermediate Algebra for Optimizing RDF Graph Pattern Matching on MapReduce. The Semanic Web: Research and Applications, pp. 46-61 (2011). DOI: 10.1007/978-3-642-21064-8_4
Heiner Stuckenschmidt, Jeen Broekstra. Time-space trade-offs in scaling up RDF schema reasoning. Web Information Systems Engineering, pp. 172-181 (2005). DOI: 10.1007/11581116_18
Wangchao Le, Anastasios Kementsietsidis, Songyun Duan, Feifei Li. Scalable Multi-query Optimization for SPARQL. 2012 IEEE 28th International Conference on Data Engineering, pp. 666-677 (2012). DOI: 10.1109/ICDE.2012.37
Jeffrey Dean, Sanjay Ghemawat. MapReduce. Communications of the ACM, vol. 51, pp. 107-113 (2008). DOI: 10.1145/1327452.1327492
HyeongSik Kim, Padmashree Ravindra, Kemafor Anyanwu. From SPARQL to MapReduce. Proceedings of the VLDB Endowment, vol. 4, pp. 1426-1429 (2011). DOI: 10.14778/3402755.3402787