Authors: Piotr Luszczek, Jack Dongarra
DOI: 10.1007/978-3-642-31464-3_74
Keywords: Computation, Linear system, Shared memory, Parallel algorithm, Supercomputer, Observational error, Linear algebra, Parallel computing, Computer science, Scaling
Abstract: We present a modeling framework to accurately predict the time to run dense linear algebra calculations. We report the framework's accuracy in a number of varied computational environments, such as shared memory multicore systems, clusters, and large supercomputing installations with tens of thousands of cores. We also test the accuracy for various algorithms, each of which has different scaling properties and a different tolerance to low-bandwidth/high-latency interconnects. The predictive accuracy is very good, on the order of the measurement accuracy, which makes the method suitable for both dedicated and non-dedicated environments. We also present a practical application of our model: reducing the time required to tune and optimize parallel runs whose time is dominated by linear algebra computations. We show practical examples of how to apply the methodology to avoid common pitfalls and to reduce the influence of measurement errors and of inherent performance variability.