Evaluating Modern GPU Interconnect: PCIe, NVLink, NV-SLI, NVSwitch and GPUDirect

Authors: Nathan R. Tallent, Shuaiwen Leon Song, Jiajia Li, Jieyang Chen, Kevin J. Barker

DOI: 10.1109/TPDS.2019.2928289

Abstract: … 6A and 6B, we can observe slight NUMA effects on PCIe accesses: two GPUs sharing the same PCIe switch (e.g., G2 and G3 in Fig. 1) exhibit lower bandwidth in the measurement. For …

References (31)
Ang Li, Y.C. Tay, Akash Kumar, Henk Corporaal. Transit: A Visual Analytical Model for Multithreaded Machines. High Performance Distributed Computing, pp. 101-106 (2015). DOI: 10.1145/2749246.2749265
Kyle Spafford, Jeremy S. Meredith, Jeffrey S. Vetter. Quantifying NUMA and contention effects in multi-GPU systems. Proceedings of the Fourth Workshop on General Purpose Processing on Graphics Processing Units - GPGPU-4, pp. 11- (2011). DOI: 10.1145/1964179.1964194
Tal Ben-Nun, Ely Levy, Amnon Barak, Eri Rubin. Memory access patterns: the missing piece of the multi-GPU puzzle. IEEE International Conference on High Performance Computing Data and Analytics, pp. 19- (2015). DOI: 10.1145/2807591.2807611
Gwangsun Kim, Minseok Lee, Jiyun Jeong, John Kim. Multi-GPU System Design with Memory Networks. International Symposium on Microarchitecture, pp. 484-495 (2014). DOI: 10.1109/MICRO.2014.55
Hao Wang, Sreeram Potluri, Devendar Bureddy, Carlos Rosales, Dhabaleswar K. Panda. GPU-Aware MPI on RDMA-Enabled Clusters: Design, Implementation and Evaluation. IEEE Transactions on Parallel and Distributed Systems, vol. 25, pp. 2595-2605 (2014). DOI: 10.1109/TPDS.2013.222
Ang Li, Gert-Jan van den Braak, Henk Corporaal, Akash Kumar. Fine-Grained Synchronizations and Dataflow Programming on GPUs. International Conference on Supercomputing, pp. 109-118 (2015). DOI: 10.1145/2751205.2751232
Dimitrios Ziakas, Allen Baum, Robert A. Maddox, Robert J. Safranek. Intel® QuickPath Interconnect Architectural Features Supporting Scalable System Architectures. 2010 18th IEEE Symposium on High Performance Interconnects, pp. 1-6 (2010). DOI: 10.1109/HOTI.2010.24
Hao Wang, Sreeram Potluri, Miao Luo, Ashish Kumar Singh, Sayantan Sur, Dhabaleswar K. Panda. MVAPICH2-GPU: optimized GPU to GPU communication for InfiniBand clusters. Computer Science - Research and Development, vol. 26, pp. 257-266 (2011). DOI: 10.1007/S00450-011-0171-3
Ang Li, Gert-Jan van den Braak, Akash Kumar, Henk Corporaal. Adaptive and transparent cache bypassing for GPUs. IEEE International Conference on High Performance Computing Data and Analytics, pp. 17- (2015). DOI: 10.1145/2807591.2807606
Simon Pabst, Artur Koch, Wolfgang Straßer. Fast and Scalable CPU/GPU Collision Detection for Rigid and Deformable Surfaces. Computer Graphics Forum, vol. 29, pp. 1605-1612 (2010). DOI: 10.1111/J.1467-8659.2010.01769.X