Reducing the time to tune parallel dense linear algebra routines with partial execution and performance modeling

Keywords: Computation, Linear system, Shared memory, Parallel algorithm, Supercomputer, Observational error, Linear algebra, Parallel computing, Computer science, Scaling

Abstract: We present a modeling framework to accurately predict the time to run dense linear algebra calculations. We report the framework's accuracy in a number of varied computational environments such as shared memory multicore systems, clusters, and large supercomputing installations with tens of thousands of cores. We also test the accuracy for various algorithms, each of which has different scaling properties and tolerance for low-bandwidth/high-latency interconnects. The predictive accuracy is very good, on the order of the measurement accuracy, which makes the method suitable for both dedicated and non-dedicated environments. A practical application of our model is to reduce the time required to tune and optimize large parallel runs whose time is dominated by linear algebra computations. We show practical examples of how to apply the methodology to avoid common pitfalls and reduce the influence of measurement errors and the inherent performance variability.


