Authors: Neil G. Dickson, Kamran Karimi, Firas Hamze
DOI: 10.1016/J.JCP.2011.03.041
Keywords: Computational science, Graphics processing unit, CPU shielding, Central processing unit, Speedup, Software performance testing, Vectorization (mathematics), Parallel computing, Computer science, CPU modes, CUDA
Abstract: … the CPU and the GPU implementations. Section 3 shows how different parts of the code were vectorized. This section also explains how memory coalescing for GPU was performed. …