epiC

Authors: Dawei Jiang, Gang Chen, Beng Chin Ooi, Kian-Lee Tan, Sai Wu

DOI: 10.14778/2732286.2732291

Abstract: The Big Data problem is characterized by the so-called 3V features: Volume (a huge amount of data), Velocity (a high data ingestion rate), and Variety (a mix of structured, semi-structured, and unstructured data). The state-of-the-art solutions to the Big Data problem are largely based on the MapReduce framework (also known through its open source implementation, Hadoop). Although Hadoop handles the data volume challenge successfully, it does not deal with data variety well, since its programming interfaces and associated data processing model are inconvenient and inefficient for handling structured data and graph data. This paper presents epiC, an extensible system to tackle Big Data's data variety challenge. epiC introduces a general Actor-like concurrent programming model, independent of the data processing models, for specifying parallel computations. Users process multi-structured datasets with appropriate epiC extensions: the implementation of a data processing model best suited to the data type, plus auxiliary code for mapping that data processing model into epiC's concurrent programming model. Like Hadoop, programs written in this way can be automatically parallelized, and the runtime system takes care of fault tolerance and inter-machine communications. We present the design and implementation of epiC's concurrent programming model. We also present two customized data processing models, an optimized MapReduce extension and a relational model, on top of epiC. Experiments demonstrate the effectiveness and efficiency of our proposed epiC.
