Data-Parallel BLAS as a Basis for LAPACK on Massively Parallel Computers

Authors: P. E. Bjørstad, T. Sørevik

Abstract: We consider a data-parallel implementation of LU-factorization based on the LAPACK routine DGETRF. We analyze the performance of the required BLAS routines and show that high performance is inhibited by current compiler limitations. In particular, we show that optimal data movement when performing rank-1 updates is crucial. The rank-1 update is available as a BLAS-2 routine and can also easily be expressed using the intrinsic function SPREAD in Fortran 90. However, in order to minimize processor communication, this operation should be explicitly inlined in the computational kernels. Using this observation, we identify the need for an explicit LU-factorization routine applied to a single block. With the freedom to adjust the block-size to the hardware, this is a much simpler task than writing the full code in low level machine language. By extension, high performance is achievable without modifying the block structure of the LAPACK routine. We expect similar observations to hold for other modules in LAPACK.
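
The rank-1 formulation mentioned in the abstract can be illustrated with a small sketch. This is not the authors' code: it is a plain-Python stand-in, written under the assumption that an unblocked, partially pivoted LU factorization of a square matrix is a fair miniature of the single-block kernel. The helper names (`spread_cols`, `spread_rows`, `rank1_update`, `lu_rank1`, `lu_residual`) are hypothetical. The two spread helpers mimic Fortran 90 `SPREAD(x, DIM=2, NCOPIES=n)` and `SPREAD(y, DIM=1, NCOPIES=m)`; materializing both spread operands, as done here, is the naive form that the abstract argues should instead be inlined into the computational kernel.

```python
def spread_cols(x, ncopies):
    """Replicate vector x across columns, like SPREAD(x, DIM=2, NCOPIES=n)."""
    return [[xi] * ncopies for xi in x]


def spread_rows(y, ncopies):
    """Replicate vector y down rows, like SPREAD(y, DIM=1, NCOPIES=m)."""
    return [list(y) for _ in range(ncopies)]


def rank1_update(a, x, y):
    """Return A - x*y^T, formed elementwise from two spread operands."""
    m, n = len(x), len(y)
    xs = spread_cols(x, n)  # m-by-n, column j is x
    ys = spread_rows(y, m)  # m-by-n, row i is y
    return [[a[i][j] - xs[i][j] * ys[i][j] for j in range(n)] for i in range(m)]


def lu_rank1(a):
    """Unblocked LU with partial pivoting on a square matrix (illustrative).

    Returns (lu, perm): lu packs unit-lower L below the diagonal and U on
    and above it; row i of the factored matrix is original row perm[i].
    """
    n = len(a)
    lu = [row[:] for row in a]
    perm = list(range(n))
    for k in range(n):
        # Pick the largest entry in column k as pivot and swap rows.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        lu[k], lu[p] = lu[p], lu[k]
        perm[k], perm[p] = perm[p], perm[k]
        # Multipliers (column of L) and pivot row (row of U).
        col = [lu[i][k] / lu[k][k] for i in range(k + 1, n)]
        row = lu[k][k + 1:]
        for i, mult in zip(range(k + 1, n), col):
            lu[i][k] = mult
        # Rank-1 update of the trailing submatrix.
        trailing = [lu[i][k + 1:] for i in range(k + 1, n)]
        updated = rank1_update(trailing, col, row)
        for i, new_row in zip(range(k + 1, n), updated):
            lu[i][k + 1:] = new_row
    return lu, perm


def lu_residual(a, lu, perm):
    """Largest absolute entry of L*U - P*A, as a correctness check."""
    n = len(a)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            # L[i][k] is lu[i][k] for k < i and 1 for k == i.
            s = sum((lu[i][k] if k < i else 1.0) * lu[k][j]
                    for k in range(min(i, j) + 1))
            worst = max(worst, abs(s - a[perm[i]][j]))
    return worst
```

Calling `lu_rank1(a)` on a small nonsingular matrix and then `lu_residual(a, lu, perm)` should give a value near machine precision. In a blocked routine such as DGETRF, a kernel of this shape would be applied to one block at a time, with the block size left as a tuning parameter.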

Similar articles (4)

Two Different Data-Parallel Implementations of the BLAS

Large Scale Structural Analysis on Massively Parallel Computers

Parallelizing a level 3 BLAS library for LAN-connected workstations

Processor-efficient sparse matrix-vector multiplication
