Sparse matrix-vector multiplication on graphics processor units

Authors: Muthu M. Baskaran, Rajesh J. Bordawekar

DOI:

Keywords:

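Abstract: Techniques for optimizing sparse matrix-vector multiplication (SpMV) on a graphics processing unit (GPU) are provided. The techniques include receiving a sparse matrix-vector multiplication; analyzing the sparse matrix-vector multiplication to identify one or more optimizations, wherein the analysis comprises analyzing the non-zero pattern and determining whether the sparse matrix-vector multiplication is to be reused across computation; optimizing the sparse matrix-vector multiplication, wherein the optimization comprises optimizing global memory access, optimizing shared memory access, and exploiting reuse and parallelism; and outputting an optimized sparse matrix-vector multiplication.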
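For orientation, the operation being optimized can be sketched on the CPU as follows. This is a minimal reference sketch, not the method described in the abstract: it assumes the compressed sparse row (CSR) storage format, which the abstract does not name, and the function names `spmv_csr` and `row_nnz_profile` are hypothetical. The second function shows one simple way to summarize a non-zero pattern (non-zeros per row), the kind of information an analysis step might inspect before choosing an optimization.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form (assumed format).

    row_ptr[i]..row_ptr[i+1] delimits the non-zeros of row i;
    col_idx and values hold their column indices and numeric values.
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # Each row is an independent dot product, which is what makes
        # SpMV a natural fit for parallel hardware.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


def row_nnz_profile(row_ptr):
    """Summarize the non-zero pattern as non-zeros per row (illustrative only)."""
    counts = [row_ptr[i + 1] - row_ptr[i] for i in range(len(row_ptr) - 1)]
    return {"min": min(counts), "max": max(counts), "mean": sum(counts) / len(counts)}


# Example matrix (hypothetical data):
# A = [[4, 0, 1],
#      [0, 2, 0],
#      [3, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
x = [1.0, 2.0, 3.0]

y = spmv_csr(row_ptr, col_idx, values, x)
profile = row_nnz_profile(row_ptr)
```

On a GPU, the accesses to `x[col_idx[k]]` are irregular, which is why the abstract singles out global memory access, shared memory access, and reuse as optimization targets.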
