Efficient and Simplified Parallel Graph Processing over CPU and MIC

作者: Linchuan Chen , Xin Huo , Bin Ren , Surabhi Jain , Gagan Agrawal

DOI: 10.1109/IPDPS.2015.88

关键词:

摘要: Intel Xeon Phi (MIC architecture) is a relatively new accelerator chip, which combines large-scale shared memory parallelism with wide SIMD lanes. Mapping applications on anode such an architecture to achieve high parallel efficiency's major challenge. In this paper, we focus developing system for heterogeneous graph processing, able utilize both many-core and multi-core CPU ozone node. We propose simple programming API unintuitive interface expressing parallelism. develop efficient techniques supporting our high-level API, focusing exploiting lanes, massive number of cores, partitioning the work across accelerator, while handling irregularity applications. The components runtime include condensed static buffer, supports message insertion reduction keeping requirements low, specifically formic, pipelining scheme generation by avoiding frequent locking operations. Besides, hybrid module effectively partition workload between MIC, ensuring balanced low communication overhead. main observations from experimental evaluation using five popular are: formic executions, up 3.36x faster than naive approach based generation, speedup over OpenMP ranges 1.17 4.15. Heterogeneous-MIC execution achieves 1.41 better CPU-only MIC-only executions.

参考文章(41)
Jiyang Chen, Randy Goebel, Osmar R. Zaïane, Detecting Communities in Social Networks Using Max-Min Modularity. siam international conference on data mining. pp. 978- 989 ,(2009)
Siegfried Benkner, Martin Sandrieser, Sabri Pllana, Enes Bajrovic, Beverly Bachmayer, Jiri Dokulil, Efficient Hybrid Execution of C++ Applications using Intel(R) Xeon Phi(TM) Coprocessor arXiv: Distributed, Parallel, and Cluster Computing. ,(2012)
Tieyun Qian, Jaideep Srivastava, Zhiyong Peng, Phillip C. Y. Sheu, Simultaneously Finding Fundamental Articles and New Topics Using a Community Tracking Method Advances in Knowledge Discovery and Data Mining. pp. 796- 803 ,(2009) , 10.1007/978-3-642-01307-2_82
Danny Bickson, Aapo Kyrola, Carlos Guestrin, Joseph Hellerstein, Yucheng Low, Joseph Gonzalez, GraphLab: a new framework for parallel machine learning uncertainty in artificial intelligence. pp. 340- 349 ,(2010)
Farzad Khorasani, Keval Vora, Rajiv Gupta, Laxmi N. Bhuyan, CuSha: vertex-centric graph processing on GPUs high performance distributed computing. pp. 239- 252 ,(2014) , 10.1145/2600212.2600227
Semih Salihoglu, Jennifer Widom, GPS: a graph processing system statistical and scientific database management. pp. 22- ,(2013) , 10.1145/2484838.2484843
Nisheeth Shrivastava, Anirban Majumder, Rajeev Rastogi, Mining (Social) Network Graphs to Detect Random Link Attacks 2008 IEEE 24th International Conference on Data Engineering. pp. 486- 495 ,(2008) , 10.1109/ICDE.2008.4497457
Douglas Gregor, Andrew Lumsdaine, Lifting sequential graph algorithms for distributed-memory parallel computation Proceedings of the 20th annual ACM SIGPLAN conference on Object oriented programming systems languages and applications - OOPSLA '05. ,vol. 40, pp. 423- 437 ,(2005) , 10.1145/1094811.1094844
Jon M. Kleinberg, Authoritative sources in a hyperlinked environment symposium on discrete algorithms. pp. 668- 677 ,(1998) , 10.5555/314613.315045
Duane Merrill, Michael Garland, Andrew Grimshaw, Scalable GPU graph traversal acm sigplan symposium on principles and practice of parallel programming. ,vol. 47, pp. 117- 128 ,(2012) , 10.1145/2145816.2145832