Parallel Algorithms for Successive Convolution

Authors: Andrew J. Christlieb, Pierson T. Guthrey, William A. Sands, Mathialakan Thavappiragasam

DOI: 10.1007/S10915-020-01359-X

Abstract: The development of modern computing architectures with ever-increasing amounts of parallelism has allowed for the solution of previously intractable problems across a variety of scientific disciplines. Despite these advances, multiscale problems continue to pose an incredible challenge because they require resolving scales that often vary by orders of magnitude in both space and time. Such complications have led us to consider alternative discretizations for partial differential equations (PDEs) which use expansions involving integral operators to approximate spatial derivatives (Christlieb et al. J Comput Phys 379:214–236, 2019; Christlieb et al. J Sci Comput 82:52(3):1–29, 2020; Christlieb et al. J Comput Phys 415:1–25, 2020). These constructions use explicit information within the integral terms, but treat boundary data implicitly, which contributes to the overall speed of the method. This approach is provably unconditionally stable for linear problems, and stability has been demonstrated experimentally for nonlinear problems. Additionally, it is matrix-free in the sense that it is not necessary to invert linear systems, and iteration is not required for nonlinear terms. Moreover, the scheme employs a fast summation algorithm that yields a method with a computational complexity of $\mathcal{O}(N)$, where N is the number of mesh points along a coordinate direction. While much work has been done to explore the theory behind these methods, their practicality in large-scale computing environments is a largely unexplored topic. In this work, we explore the performance of these methods by developing a domain decomposition algorithm suitable for distributed memory systems along with shared memory algorithms. As a first pass, we derive an artificial Courant–Friedrichs–Lewy condition that enforces a nearest-neighbor (N-N) communication pattern and briefly discuss possible generalizations. We also analyze several approaches for implementing the parallel algorithms by optimizing predominant loop structures and maximizing data reuse. Using a hybrid design that employs MPI and Kokkos (Edwards and Trott, J Parallel Distrib Comput 74:3202–3216, 2014) for the distributed and shared memory components of the algorithms, respectively, we show that our methods are efficient and can sustain an update rate $> 1\times 10^8$ DOF/node/s. We provide results that demonstrate the scalability and versatility of our algorithms using several different PDE test problems, including a nonlinear example, which employs an adaptive time-stepping rule.

References (36)
Andrew J. Christlieb, Yaman Guclu, Eric Wolf, Matthew F. Causley. Method of Lines Transpose: A Fast Implicit Wave Propagator. arXiv: Numerical Analysis (2013).
Matthew F. Causley, Andrew J. Christlieb. Higher Order A-Stable Schemes for the Wave Equation Using a Successive Convolution Approach. SIAM Journal on Numerical Analysis, vol. 52, pp. 220–235 (2014). DOI: 10.1137/130932685
Matthew Causley, Andrew Christlieb, Benjamin Ong, Lee Van Groningen. Method of lines transpose: An implicit solution to the wave equation. Mathematics of Computation, vol. 83, pp. 2763–2786 (2014). DOI: 10.1090/S0025-5718-2014-02834-2
Haitao Wang, Ting Lei, Jin Li, Jingfang Huang, Zhenhan Yao. A parallel fast multipole accelerated integral equation scheme for 3D Stokes equations. International Journal for Numerical Methods in Engineering, vol. 70, pp. 812–839 (2007). DOI: 10.1002/NME.1910
Nathan Albin, Oscar P. Bruno. A spectral FC solver for the compressible Navier-Stokes equations in general domains I: Explicit time-stepping. Journal of Computational Physics, vol. 230, pp. 6248–6270 (2011). DOI: 10.1016/J.JCP.2011.04.023
Jim Douglas. Alternating direction methods for three space variables. Numerische Mathematik, vol. 4, pp. 41–63 (1962). DOI: 10.1007/BF01386295
Mary Catherine A. Kropinski, Bryan D. Quaife. Fast integral equation methods for Rothe's method applied to the isotropic heat equation. Computers & Mathematics With Applications, vol. 61, pp. 2436–2446 (2011). DOI: 10.1016/J.CAMWA.2011.02.024