Authors: Franck Cappello, Daniel Etiemble
Keywords:
Abstract: The hybrid memory model of clusters of multiprocessors raises two issues: programming model and performance. Many parallel programs have been written using the MPI standard. To evaluate the pertinence of hybrid models for existing MPI codes, we compare a unified model (MPI) and a hybrid one (OpenMP fine-grain parallelization after profiling) for the NAS 2.3 benchmarks on two IBM SP systems. The superiority of one model depends on 1) the level of shared-memory model parallelization, 2) the communication patterns, and 3) the memory access patterns. The relative speeds of the main architecture components (CPU, memory, and network) are of tremendous importance for selecting one model. With the hybrid model used, our results show that a unified MPI approach is better for most of the benchmarks. The hybrid approach becomes better only when fast processors make the communication performance significant and the level of parallelization is sufficient.