Authors: Philipp Grete, Brian W. O'Shea, Forrest W. Glines
DOI:
Keywords:
Abstract: Large-scale simulations are a key pillar of modern research and require ever-increasing computational resources. Different novel manycore architectures have emerged in recent years on the way towards the exascale era. Performance portability is required to prevent repeated non-trivial refactoring of a code for different architectures. We combine Athena++, an existing magnetohydrodynamics (MHD) CPU code, with Kokkos, a performance portable on-node parallel programming paradigm, into K-Athena to allow efficient simulations on multiple architectures using a single codebase. We present profiling and scaling results for different platforms including Intel Skylake CPUs, Intel Xeon Phis, and NVIDIA GPUs. K-Athena achieves $>10^8$ cell-updates/s on a single V100 GPU for second-order double precision MHD calculations, and a speedup of 30 on up to 24,576 GPUs on Summit (compared to 172,032 CPU cores), reaching $1.94\times10^{12}$ total cell-updates/s at 76% parallel efficiency. Using a roofline analysis we demonstrate that the overall performance is currently limited by DRAM bandwidth and calculate a performance portability metric of 62.8%. Finally, we present the implementation strategies used and the challenges encountered in maximizing performance. This will provide other research groups with a straightforward approach to prepare their own codes for the exascale era. K-Athena is available at this https URL .
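The scaling figures quoted in the abstract can be related to each other with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the K-Athena code: the function names are hypothetical, parallel efficiency is assumed to mean per-device throughput relative to a single-device run, and the portability metric is assumed to be a harmonic mean of per-platform efficiencies (zero if any platform is unsupported), which may differ in detail from the definition used in the paper. The per-platform efficiencies in the example are placeholders, not measured values.

```python
from typing import Sequence


def parallel_efficiency(rate_n: float, n: int, rate_ref: float, n_ref: int = 1) -> float:
    """Per-device throughput on n devices relative to per-device throughput on n_ref devices."""
    return (rate_n / n) / (rate_ref / n_ref)


def implied_reference_rate(rate_n: float, n: int, efficiency: float) -> float:
    """Single-device rate implied by a total rate on n devices and a given parallel efficiency."""
    return rate_n / n / efficiency


def performance_portability(efficiencies: Sequence[float]) -> float:
    """Harmonic mean of per-platform efficiencies (assumed definition); 0 if any platform fails."""
    if not efficiencies or any(e <= 0.0 for e in efficiencies):
        return 0.0
    return len(efficiencies) / sum(1.0 / e for e in efficiencies)


if __name__ == "__main__":
    # Figures quoted in the abstract: total rate, GPU count, parallel efficiency.
    total_rate = 1.94e12   # cell-updates/s on Summit
    n_gpus = 24_576
    efficiency = 0.76

    single_gpu = implied_reference_rate(total_rate, n_gpus, efficiency)
    print(f"implied single-GPU rate: {single_gpu:.3e} cell-updates/s")

    # Placeholder efficiencies for three hypothetical platforms (NOT from the paper).
    print(f"portability metric: {performance_portability([0.7, 0.6, 0.5]):.3f}")
```

Under these assumptions the implied single-GPU rate comes out at roughly $1.04\times10^{8}$ cell-updates/s, which is consistent with the $>10^8$ figure stated for a single V100.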