Utilizing dynamic parallelism in CUDA to accelerate a 3D Red-Black Successive Over Relaxation wind-field solver

Authors: Pete Willemsen, Eric R. Pardyjak, Jeremy A. Gibbs, Jae-Jin Kim

DOI: 10.1016/J.ENVSOFT.2021.104958

Keywords: Computational science, Variational analysis, CUDA, Successive over-relaxation, Lagrange multiplier, Speedup, Poisson's equation, Solver, Kernel (statistics), Computer science

Abstract: QES-Winds is a fast-response wind modeling platform for simulating high-resolution mean wind fields for optimization and prediction. The code uses a variational analysis technique to solve the Poisson equation for Lagrange multipliers to obtain the mean wind field, and GPU parallelization to accelerate the numerical solution of that equation. QES-Winds benefits from CUDA dynamic parallelism (launching kernels from the GPU) to speed up calculations by a factor of 128 compared to the serial solver for a domain with 145 million cells. This enables it to calculate mean velocity fields for domains with sizes of 10 km² and horizontal resolutions of 1 to 3 m in under 1 min. As a result, QES-Winds is suitable for computing high-resolution wind fields on large domains in real time, which can be used to model a wide range of real-world problems, including wildfires and urban air quality.
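The red-black ordering named in the title is what makes SOR suitable for a GPU: cells are coloured by the parity of i + j + k, so every cell of one colour depends only on cells of the other colour, and a whole colour can be updated at once. The following is a minimal serial NumPy sketch of that idea, not the QES-Winds implementation. It assumes a simplified model problem (the plain Poisson equation ∇²u = f on a uniform grid with u = 0 on the boundary), and the function name `red_black_sor` and its parameters are hypothetical.

```python
import numpy as np


def red_black_sor(f, h, omega=1.5, tol=1e-8, max_iter=10_000):
    """Solve laplacian(u) = f on a uniform 3D grid with u = 0 on the boundary.

    Illustrative model problem only; the boundary conditions and
    coefficients of a real wind-field solver are not reproduced here.
    """
    u = np.zeros_like(f, dtype=float)

    # Colour interior cells by the parity of i + j + k (0 = red, 1 = black).
    i, j, k = np.indices(f.shape)
    interior = np.zeros(f.shape, dtype=bool)
    interior[1:-1, 1:-1, 1:-1] = True
    masks = [interior & ((i + j + k) % 2 == colour) for colour in (0, 1)]

    for iteration in range(1, max_iter + 1):
        max_change = 0.0
        for mask in masks:
            # Sum of the six face neighbours. np.roll wraps around, but the
            # wrapped values only reach boundary cells, which are masked out.
            neighbours = sum(
                np.roll(u, shift, axis=axis)
                for axis in range(3)
                for shift in (1, -1)
            )
            gauss_seidel = (neighbours - h * h * f) / 6.0
            delta = omega * (gauss_seidel - u)
            # All cells of one colour are independent, so this masked update
            # is the step a GPU kernel would perform in parallel.
            u[mask] += delta[mask]
            max_change = max(max_change, np.max(np.abs(delta[mask]), initial=0.0))
        if max_change < tol:
            return u, iteration
    return u, max_iter
```

In a CUDA version, each masked update would typically become one kernel launch per colour. With dynamic parallelism, as the abstract describes it, those launches are issued from the GPU itself rather than from the host.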
