A Note on Auto-tuning GEMM for GPUs

Authors: Yinan Li, Jack Dongarra, Stanimire Tomov

DOI: 10.1007/978-3-642-01970-8_89

Keywords: Computer science, Double-precision floating-point format, Matrix multiplication, Computational science, Graphics, Key (cryptography), Parallel computing, Single-precision floating-point format, Software portability, Linear algebra, CUDA

Abstract: The development of high-performance dense linear algebra (DLA) critically depends on highly optimized BLAS, and especially on the matrix multiplication routine (GEMM). This is …

References (13)

R. Clint Whaley, Antoine Petitet, Jack J. Dongarra, New trends in high performance computing. IEEE International Conference on High Performance Computing, Data, and Analytics, vol. 27, pp. 3-35 (2001), doi:10.1016/S0167-8191(00)00087-9

Jeff Bilmes, Krste Asanovic, Chee-Whye Chin, Jim Demmel, Optimizing matrix multiply using PHiPAC: a portable, high-performance, ANSI C coding methodology. International Conference on Supercomputing, pp. 253-260 (1997), doi:10.1145/2591635.2667174

J. Dongarra, G. Bosilca, Z. Chen, V. Eijkhout, G. E. Fagg, E. Fuentes, J. Langou, P. Luszczek, J. Pjesivac-Grbovic, K. Seymour, H. You, S. S. Vadhiyar, Self-adapting numerical software (SANS) effort. IBM Journal of Research and Development, vol. 50, pp. 223-238 (2006), doi:10.1147/RD.502.0223

John A. Gunnels, Fred G. Gustavson, Greg M. Henry, Robert A. van de Geijn, FLAME: Formal Linear Algebra Methods Environment. ACM Transactions on Mathematical Software, vol. 27, pp. 422-455 (2001), doi:10.1145/504210.504213

James W. Demmel, Vasily Volkov, Benchmarking GPUs to tune dense linear algebra. IEEE International Conference on High Performance Computing, Data, and Analytics, pp. 31- (2008), doi:10.5555/1413370.1413402

M. Frigo, S.G. Johnson, FFTW: an adaptive software architecture for the FFT. International Conference on Acoustics, Speech, and Signal Processing, vol. 3, pp. 1381-1384 (1998), doi:10.1109/ICASSP.1998.681704

John Shalf, Krste Asanovic, Parry Husbands, Katherine A. Yelick, David A. Patterson, William Lester Plishker, Joseph James Gebis, Samuel Webb Williams, Ras Bodik, Bryan Christopher Catanzaro, Kurt Keutzer, The Landscape of Parallel Computing Research: A View from Berkeley (2006)

Jack Dongarra, Gregory Peterson, Stanimir Tomov, Jeff Allred, Vincent Natoli, David Richie, Exploring New Architectures in Accelerating CFD for Air Force Applications. DoD HPCMP Users Group Conference, pp. 472-478 (2008), doi:10.1109/DOD.HPCMP.UGC.2008.12

J. Demmel, J. Dongarra, V. Eijkhout, E. Fuentes, A. Petitet, R. Vuduc, R.C. Whaley, K. Yelick, Self-Adapting Linear Algebra Algorithms and Software. Proceedings of the IEEE, vol. 93, pp. 293-312 (2005), doi:10.1109/JPROC.2004.840848

Stanimire Tomov, Jack Dongarra, Marc Baboulin, Towards dense linear algebra for hybrid GPU accelerated manycore systems. Parallel Computing, vol. 36, pp. 232-240 (2010), doi:10.1016/J.PARCO.2009.12.005
