Authors: Linchuan Chen, Xin Huo, Bin Ren, Surabhi Jain, Gagan Agrawal
Keywords:
Abstract: Intel Xeon Phi (MIC architecture) is a relatively new accelerator chip, which combines large-scale shared memory parallelism with wide SIMD lanes. Mapping applications on a node with such an architecture to achieve high parallel efficiency is a major challenge. In this paper, we focus on developing a system for heterogeneous graph processing, which is able to utilize both the many-core Xeon Phi and a multi-core CPU on one node. We propose a simple programming API with an intuitive interface for expressing SIMD parallelism. We develop efficient techniques for supporting our high-level API, focusing on exploiting wide SIMD lanes, the massive number of cores, and partitioning of the work across the CPU and the accelerator, while handling the irregularity of graph applications. The components of our runtime system include a condensed static memory buffer, which supports efficient message insertion and SIMD message reduction while keeping memory requirements low, and, specifically for MIC, a pipelining scheme for efficient message generation that avoids frequent locking operations. Besides, a hybrid graph partitioning module is able to effectively partition the workload between the CPU and the MIC, ensuring a balanced workload and low communication overhead. The main observations from our experimental evaluation using five popular graph applications are: for MIC executions, the pipelining scheme is up to 3.36x faster than a naive approach based on locking-based message generation; the speedup over OpenMP ranges from 1.17 to 4.15; and heterogeneous CPU-MIC execution achieves a speedup of up to 1.41 over the better of the CPU-only and MIC-only executions.
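
The abstract names two runtime ideas without showing them: a statically sized message buffer that supports insertion and reduction, and message generation that avoids frequent locking. The following is a minimal Python sketch of one way such a scheme could look. It is not the authors' implementation. It assumes (hypothetically) that the buffer is sized once from vertex in-degrees, that each destination vertex is owned by exactly one "mover" so inserts need no lock, and that the reduction is a minimum, as in single-source shortest paths. The stages are run sequentially here for clarity, and all function names are illustrative.

```python
def build_buffer(edges, num_vertices):
    """Preallocate one slot per incoming edge (assumed layout, CSR-style)."""
    in_degree = [0] * num_vertices
    for out_edges in edges.values():
        for dst, _ in out_edges:
            in_degree[dst] += 1
    offsets = [0] * (num_vertices + 1)
    for v in range(num_vertices):
        offsets[v + 1] = offsets[v] + in_degree[v]
    slots = [None] * offsets[num_vertices]
    counts = [0] * num_vertices  # messages currently stored per vertex
    return offsets, slots, counts


def generate_messages(edges, dist, frontier, num_movers):
    """Stage 1 (workers): route each message to the queue of its owning mover."""
    queues = [[] for _ in range(num_movers)]
    for src in frontier:
        for dst, weight in edges.get(src, []):
            queues[dst % num_movers].append((dst, dist[src] + weight))
    return queues


def insert_messages(queues, offsets, slots, counts):
    """Stage 2 (movers): each mover touches only vertices it owns, so no lock."""
    for queue in queues:
        for dst, value in queue:
            slots[offsets[dst] + counts[dst]] = value
            counts[dst] += 1


def reduce_messages(offsets, slots, counts, dist):
    """Reduce each vertex's messages with min and return the updated vertices."""
    updated = []
    for v in range(len(counts)):
        if counts[v] == 0:
            continue
        best = min(slots[offsets[v]:offsets[v] + counts[v]])
        counts[v] = 0  # reuse the same slots in the next superstep
        if best < dist[v]:
            dist[v] = best
            updated.append(v)
    return updated


def sssp(edges, num_vertices, source, num_movers=2):
    """Run supersteps until no vertex distance improves."""
    dist = [float("inf")] * num_vertices
    dist[source] = 0
    offsets, slots, counts = build_buffer(edges, num_vertices)
    frontier = [source]
    while frontier:
        queues = generate_messages(edges, dist, frontier, num_movers)
        insert_messages(queues, offsets, slots, counts)
        frontier = reduce_messages(offsets, slots, counts, dist)
    return dist
```

For the graph `{0: [(1, 4), (2, 1)], 2: [(1, 2)], 1: [(3, 1)]}` with 4 vertices and source 0, `sssp` returns `[0, 3, 1, 4]`. Because each vertex appears at most once in a frontier, a vertex never receives more messages in one superstep than it has incoming edges, so the preallocated slots cannot overflow. In a real SIMD setting the per-vertex minimum would be computed across vector lanes rather than with Python's `min`.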