High order accurate simulation of compressible flows on GPU clusters over Software Distributed Shared Memory

Authors: Konstantinos I. Karantasis, Eleftherios D. Polychronopoulos, John A. Ekaterinaris

DOI: 10.1016/J.COMPFLUID.2014.01.005

Keywords: Message passing; Multi-core processor; Shared memory; Theoretical computer science; Memory hierarchy; Distributed memory; Multiprocessing; CUDA; Pinned memory; Computer science; GPU cluster; Parallel computing

Abstract: The advent of multicore processors during the past decade, and especially the recent introduction of many-core Graphics Processing Units (GPUs), opens new horizons to large-scale, high-resolution simulations for a broad range of scientific fields. Residing at the forefront of advancements in multiprocessor technology, GPUs are often chosen as co-processors when the intensive parts of applications need to be computed. Among various domains, the area of Computational Fluid Dynamics (CFD) is a potential candidate that could significantly benefit from the utilization of GPUs. In order to investigate this possibility, we herein evaluate the performance of a high order accurate method for the simulation of compressible flows. Targeting computer systems with multiple GPUs, the current implementation and the respective evaluation take place on a GPU cluster. With respect to the programming of these systems, the paper offers an alternative to the mainstream approach of message passing by considering the shared memory abstraction. In the implementations presented in this paper, data updates are not explicitly coded by the programmer across the phases of the computation, but are propagated through Software Distributed Shared Memory (SDSM). In this way, we intend to preserve a unified memory view that extends the memory hierarchy from the node level to the cluster level. Such an extension could facilitate the porting of multithreaded codes to clusters. Our results indicate a competitive paradigm, and they lay the grounds for further research on the use of the shared memory abstraction in the future.
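To make the contrast in the abstract concrete, the following is a minimal sketch (not taken from the paper) of a ghost-cell update for a 1-D domain decomposition. Variant A codes the exchange explicitly with MPI, the mainstream message-passing approach; variant B hints at the SDSM idea, where the grid lives in a logically shared address space and the runtime, not the programmer, propagates the updated boundary values at a synchronization point. The calls sdsm_malloc() and sdsm_barrier() are hypothetical placeholders, not the API of any specific SDSM runtime.

```c
/* Hedged sketch: halo (ghost-cell) update between computation phases. */
#include <mpi.h>
#include <stdlib.h>

#define NLOC 1024  /* interior cells owned by this rank */

/* Variant A: explicit message passing coded by the programmer. */
static void halo_exchange_mpi(double *u, int rank, int nprocs)
{
    int left  = (rank == 0)          ? MPI_PROC_NULL : rank - 1;
    int right = (rank == nprocs - 1) ? MPI_PROC_NULL : rank + 1;

    /* send first interior cell to the left, receive the right ghost cell */
    MPI_Sendrecv(&u[1],        1, MPI_DOUBLE, left,  0,
                 &u[NLOC + 1], 1, MPI_DOUBLE, right, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    /* send last interior cell to the right, receive the left ghost cell */
    MPI_Sendrecv(&u[NLOC],     1, MPI_DOUBLE, right, 1,
                 &u[0],        1, MPI_DOUBLE, left,  1,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
}

/* Variant B: the same update under a shared memory abstraction.  The array is
 * allocated once in the logically shared address space, and a cluster-wide
 * barrier suffices because the SDSM layer propagates the written pages.
 * (Placeholder calls, shown only to contrast the two models.) */
#if 0
double *u = sdsm_malloc((NLOC + 2) * sizeof(double));  /* shared across nodes */
/* ... compute phase writes u[1..NLOC] ... */
sdsm_barrier();  /* neighbours now observe the updated boundary cells */
#endif

int main(int argc, char **argv)
{
    int rank, nprocs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    double *u = calloc(NLOC + 2, sizeof(double));  /* two ghost cells */
    /* ... compute phase writes u[1..NLOC] ... */
    halo_exchange_mpi(u, rank, nprocs);
    /* ... next phase reads u[0] and u[NLOC+1] as neighbour data ... */

    free(u);
    MPI_Finalize();
    return 0;
}
```

The point of the contrast is that in variant A the data movement between phases is part of the application code, whereas under SDSM it becomes part of the memory system, which is what allows a multithreaded node-level code to be carried over to the cluster level.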
