Authors: Damien Genet, Abdou Guermouche, George Bosilca
DOI: 10.1007/978-3-319-14313-2_29
Keywords: Scheduling (computing), Parallel algorithm, Linear algebra, Computer science, Runtime system, Parallel computing, Finite element method, Directed acyclic graph, Matrix (mathematics), Multi-core processor
Abstract: Traditionally, numerical simulations based on finite element methods consider the algorithm as being divided into three major steps: the generation of a set of blocks and vectors, the assembly of these blocks into a matrix and a big vector, and the inversion of the matrix. In this paper we tackle the second step, the block assembly, where no parallel algorithm is widely available. Several strategies are proposed to decompose the assembly problem while relying on a scheduling middleware to maximize the overlap between stages and increase the parallelism, and thus the performance. These strategies are quantified using examples covering two extremes in the field: a large number of non-overlapping small blocks for CFD-like problems, and a smaller number of larger blocks with significant overlap, which can be met in sparse linear algebra solvers.