Using GPUs to improve multigrid solver performance on a cluster

Authors: Dominik Goddeke, Robert Strzodka, Jamaludin Mohd Yusof, Patrick McCormick, Hilmar Wobker

DOI: 10.1504/IJCSE.2008.021111

Keywords: Parallel computing, Domain decomposition methods, OpenGL, Computational science, Multigrid method, Software, Iterative refinement, Hardware acceleration, Finite element method, Graphics, Computer science

Abstract: This paper explores the coupling of coarse and fine-grained parallelism for Finite Element (FE) simulations based on efficient parallel multigrid solvers. The focus lies on both system performance and a minimally invasive integration of hardware acceleration into an existing software package, requiring no changes to application code. Because of their excellent price/performance ratio, we demonstrate the viability of our approach by using commodity Graphics Processing Units (GPUs), addressing the issue of limited precision on GPUs by applying a mixed precision, iterative refinement technique. Our results show that we do not compromise any functionality and gain speedups of two and more for large problems.

References (52)
Hans Meuer, E. Strohmaier, J. Dongarra, Horst Simon, Top500 Supercomputer Sites. University of Tennessee (1997)
Martin Rumpf, Robert Strzodka, Graphics Processor Units: New Prospects for Parallel Computing. Springer, Berlin, Heidelberg, pp. 89-132 (2006). DOI: 10.1007/3-540-31619-1_3
John Waldron, Owen Harrison, Optimising data movement rates for parallel processing applications on graphics processors. Parallel and Distributed Computing and Networks, pp. 251-256 (2007)
William D. Gropp, Barry F. Smith, Petter E. Bjørstad, Domain Decomposition: Parallel Multilevel Methods for Elliptic Partial Differential Equations (1996)
Andrea Toselli, Olof B. Widlund, Domain decomposition methods: algorithms and theory. Springer, Berlin (2005). DOI: 10.1007/B137868
André DeHon, Very Large Scale Spatial Computing. Lecture Notes in Computer Science, pp. 27-36 (2002). DOI: 10.1007/3-540-45833-6_3
D. Pham, S. Asano, M. Bolliger, M.N. Day, H.P. Hofstee, C. Johns, J. Kahle, A. Kameyama, J. Keaty, Y. Masubuchi, M. Riley, D. Shippy, D. Stasiak, M. Suzuoki, M. Wang, J. Warnock, S. Weitzel, D. Wendel, T. Yamazaki, K. Yazawa, The design and implementation of a first-generation CELL processor. International Solid-State Circuits Conference, pp. 184-592 (2005). DOI: 10.1109/ISSCC.2005.1493930
Craig C. Douglas, Jonathan Hu, Wolfgang Karl, Markus Kowarschik, Ulrich Rüde, Christian Weiß, Fixed and Adaptive Cache Aware Algorithms for Multigrid Methods. Springer, Berlin, Heidelberg, pp. 87-93 (2000). DOI: 10.1007/978-3-642-58312-4_11