HadoopRDF: a scalable semantic data analytical engine

Authors: Jin-Hang Du, Hao-Fen Wang, Yuan Ni, Yong Yu

DOI: 10.1007/978-3-642-31576-3_80

Keywords:

Abstract: With the rapid growth in the scale of semantic data, analyzing this large-scale data has become a hot topic. Traditional triple stores deployed on a single machine have proved effective at providing storage and retrieval of RDF data. However, their scalability is limited, and they cannot handle billions of ever-growing triples. On the other hand, Hadoop is an open-source project that provides HDFS as a distributed file system and MapReduce as a computing framework for distributed processing. It has been shown to perform well for large-scale data analysis. In this paper, we propose HadoopRDF, a system that combines both worlds (triple stores and Hadoop) to provide a scalable data analysis service over RDF data. It benefits from the scalability of Hadoop and from the ability of traditional triple stores to support flexible analytical queries such as SPARQL. Experimental evaluation results show the effectiveness and efficiency of the approach.

References (13)
Radhika Sridhar, Padmashree Ravindra, Kemafor Anyanwu. RAPID: Enabling Scalable Ad-Hoc Analytics on the Semantic Web. International Semantic Web Conference, vol. 5823, pp. 715-730 (2009). DOI: 10.1007/978-3-642-04930-9_45
Mohammad Farhan Husain, Pankil Doshi, Latifur Khan, Bhavani Thuraisingham. Storage and Retrieval of Large RDF Graph Using Hadoop and MapReduce. International Conference on Cloud Computing, vol. 5931, pp. 680-686 (2009). DOI: 10.1007/978-3-642-10665-1_72
Jeen Broekstra, Arjohn Kampman, Frank van Harmelen. Sesame: A Generic Architecture for Storing and Querying RDF and RDF Schema. International Semantic Web Conference, pp. 54-68 (2002). DOI: 10.1007/3-540-48005-6_7
K. Selçuk Candan, Huan Liu, Reshma Suvarna. Resource description framework. ACM SIGKDD Explorations Newsletter, vol. 3, pp. 6-19 (2001). DOI: 10.1145/507533.507536
Christian Bizer, Tom Heath, Tim Berners-Lee. Linked Data - the story so far. International Journal on Semantic Web and Information Systems, vol. 5, pp. 1-22 (2009). DOI: 10.4018/JSWIS.2009081901
Padmashree Ravindra, Vikas V. Deshpande, Kemafor Anyanwu. Towards scalable RDF graph analytics on MapReduce. Proceedings of the 2010 Workshop on Massive Data Analytics on the Cloud, pp. 5- (2010). DOI: 10.1145/1779599.1779604
Azza Abouzied, Kamil Bajda-Pawlikowski, Jiewen Huang, Daniel J. Abadi, Avi Silberschatz. HadoopDB in action. Proceedings of the 2010 international conference on Management of data - SIGMOD '10, pp. 1111-1114 (2010). DOI: 10.1145/1807167.1807294
Christian Bizer, Andreas Schultz. The Berlin SPARQL benchmark. International Journal on Semantic Web and Information Systems, vol. 5, pp. 1-24 (2009). DOI: 10.4018/JSWIS.2009040101
Jaeseok Myung, Jongheum Yeon, Sang-goo Lee. SPARQL basic graph pattern processing with iterative MapReduce. Proceedings of the 2010 Workshop on Massive Data Analytics on the Cloud, pp. 6- (2010). DOI: 10.1145/1779599.1779605
Christopher Olston, Benjamin Reed, Utkarsh Srivastava, Ravi Kumar, Andrew Tomkins. Pig latin. Proceedings of the 2008 ACM SIGMOD international conference on Management of data - SIGMOD '08, pp. 1099-1110 (2008). DOI: 10.1145/1376616.1376726