Systems for big-graphs

Authors: Arijit Khan, Sameh Elnikety

DOI: 10.14778/2733004.2733067

Keywords:

Abstract: Graphs have become increasingly important to represent highly-interconnected structures and schema-less data, including the World Wide Web, social networks, knowledge graphs, genome and scientific databases, and medical and government records. The massive scale of graph data easily overwhelms the main memory and computation resources on commodity servers. In these cases, achieving low latency and high throughput requires partitioning the graph and processing it in parallel across a cluster. However, software and hardware advances that worked well for developing databases and applications are not necessarily effective for big-graph problems. Graph processing poses interesting system challenges: graphs represent relationships, which are usually irregular and unstructured; therefore, access patterns have poor locality. Hence, the last few years have seen an unprecedented interest in building systems for big-graphs by various communities, including systems, semantic web, machine learning, and operations research. In this tutorial, we discuss the design of emerging systems for big-graphs, key features of distributed algorithms, as well as workload balancing techniques. We emphasize current challenges and highlight some future research directions.

References (17)
Joseph E. Gonzalez, Yucheng Low, Haijie Gu, Danny Bickson, Carlos Guestrin. PowerGraph: distributed graph-parallel computation on natural graphs. Operating Systems Design and Implementation, pp. 17-30 (2012). DOI: 10.5555/2387880.2387883
Reynold S. Xin, Ion Stoica, Joseph E. Gonzalez, Daniel Crankshaw, Ankur Dave, Michael J. Franklin. GraphX: Unifying Data-Parallel and Graph-Parallel Analytics. arXiv: Databases (2014).
Dawei Jiang, Gang Chen, Beng Chin Ooi, Kian-Lee Tan, Sai Wu. epiC. Proceedings of the VLDB Endowment, vol. 7, pp. 541-552 (2014). DOI: 10.14778/2732286.2732291
Guy Blelloch, Aapo Kyrola, Carlos Guestrin. GraphChi: large-scale graph computation on just a PC. Operating Systems Design and Implementation, vol. 2012, pp. 31-46 (2012). DOI: 10.5555/2387880.2387884
Arijit Khan, Yinghui Wu, Xifeng Yan. Emerging Graph Queries in Linked Data. 2012 IEEE 28th International Conference on Data Engineering, pp. 1218-1221 (2012). DOI: 10.1109/ICDE.2012.143
Andrew Lumsdaine, Douglas Gregor, Bruce Hendrickson, Jonathan Berry. Challenges in Parallel Graph Processing. Parallel Processing Letters, vol. 17, pp. 5-20 (2007). DOI: 10.1142/S0129626407002843
Amitabha Roy, Ivo Mihailovic, Willy Zwaenepoel. X-Stream: edge-centric graph processing using streaming partitions. Symposium on Operating Systems Principles, pp. 472-488 (2013). DOI: 10.1145/2517349.2522740
Shengqi Yang, Xifeng Yan, Bo Zong, Arijit Khan. Towards effective partition management for large graphs. Proceedings of the 2012 International Conference on Management of Data - SIGMOD '12, pp. 517-528 (2012). DOI: 10.1145/2213836.2213895
Zuhair Khayyat, Karim Awara, Amani Alonazi, Hani Jamjoom, Dan Williams, Panos Kalnis. Mizan. Proceedings of the 8th ACM European Conference on Computer Systems - EuroSys '13, pp. 169-182 (2013). DOI: 10.1145/2465351.2465369
Derek G. Murray, Frank McSherry, Rebecca Isaacs, Michael Isard, Paul Barham, Martín Abadi. Naiad: a timely dataflow system. Symposium on Operating Systems Principles, pp. 439-455 (2013). DOI: 10.1145/2517349.2522738