Authors: Andrew J. Christlieb, Pierson T. Guthrey, William A. Sands, Mathialakan Thavappiragasm
DOI: 10.1007/S10915-020-01359-X
Keywords:
Abstract: The development of modern computing architectures with ever-increasing amounts of parallelism has allowed for the solution of previously intractable problems across a variety of scientific disciplines. Despite these advances, multiscale problems continue to pose an incredible challenge because they require resolving scales that often vary by orders of magnitude in both space and time. Such complications have led us to consider alternative discretizations for partial differential equations (PDEs) which use expansions involving integral operators to approximate spatial derivatives (Christlieb et al., J Comput Phys 379:214–236, 2019; Christlieb et al., J Sci Comput 82:52(3):1–29, 2020; Christlieb et al., J Comput Phys 415:1–25, 2020). These constructions use explicit information within the integral terms, but treat boundary data implicitly, which contributes to the overall speed of the method. This approach is provably unconditionally stable for linear problems, and stability has been demonstrated experimentally for nonlinear problems. Additionally, it is matrix-free in the sense that it is not necessary to invert linear systems, and iteration is not required for nonlinear terms. Moreover, the scheme employs a fast summation algorithm that yields a method with a computational complexity of $\mathcal{O}(N)$, where $N$ is the number of mesh points along a coordinate direction. While much work has been done to explore the theory behind these methods, their practicality in large scale computing environments is a largely unexplored topic. In this work, we explore the performance of these methods by developing a domain decomposition algorithm suitable for distributed memory systems along with shared memory algorithms. As a first pass, we derive an artificial Courant–Friedrichs–Lewy condition that enforces a nearest-neighbor (N-N) communication pattern and briefly discuss possible generalizations. We also analyze several approaches for implementing the parallel algorithms by optimizing predominant loop structures and maximizing data reuse. Using a hybrid design that employs MPI and Kokkos (Edwards and Trott, J Parallel Distrib Comput 74:3202–3216, 2014) for the distributed and shared memory components of the algorithms, respectively, we show that our methods are efficient and can sustain an update rate $> 1\times 10^8$ DOF/node/s.
We provide results that demonstrate the scalability and versatility of our algorithms using several different PDE test problems, including a nonlinear example that employs an adaptive time-stepping rule.
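
The abstract does not spell out the fast summation algorithm, so the following is only a minimal sketch of the general idea behind an $\mathcal{O}(N)$ evaluation of an integral operator. It assumes a one-sided exponential kernel, $I(x_i) = \alpha \int_{x_0}^{x_i} e^{-\alpha (x_i - y)} v(y)\,dy$, on a uniform mesh with a trapezoidal local quadrature. The kernel, the quadrature, and the names `left_convolution_fast` and `left_convolution_direct` are illustrative assumptions, not the authors' implementation. The point is that the exponential kernel lets each value be built from its left neighbor plus one local cell integral, instead of re-summing over all previous mesh points.

```python
import math


def left_convolution_fast(v, dx, alpha):
    """O(N) sweep: I[i] = exp(-alpha*dx) * I[i-1] + (local integral over one cell).

    Assumed (hypothetical) setup: uniform spacing dx, trapezoid rule per cell.
    """
    decay = math.exp(-alpha * dx)
    result = [0.0] * len(v)
    for i in range(1, len(v)):
        # Trapezoid rule on [x_{i-1}, x_i]; the left endpoint is damped by the kernel.
        local = 0.5 * alpha * dx * (decay * v[i - 1] + v[i])
        # Everything accumulated so far decays by the same factor over one cell.
        result[i] = decay * result[i - 1] + local
    return result


def left_convolution_direct(v, dx, alpha):
    """O(N^2) reference: composite trapezoid rule over [x_0, x_i] for every i."""
    n = len(v)
    result = [0.0] * n
    for i in range(1, n):
        total = 0.0
        for j in range(i + 1):
            weight = 0.5 if j in (0, i) else 1.0
            total += weight * math.exp(-alpha * (i - j) * dx) * v[j]
        result[i] = alpha * dx * total
    return result


if __name__ == "__main__":
    samples = [math.sin(0.1 * k) for k in range(50)]
    fast = left_convolution_fast(samples, 0.1, 2.0)
    direct = left_convolution_direct(samples, 0.1, 2.0)
    print(max(abs(a - b) for a, b in zip(fast, direct)))
```

Both routines apply the same quadrature, so they agree to rounding error; only the cost differs. In a domain-decomposed setting, a recurrence of this kind needs only the accumulated value at the subdomain interface, which is one plausible reason a nearest-neighbor communication pattern is attractive, though the abstract itself does not give those details.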