MPI versus MPI+OpenMP on the IBM SP for the NAS Benchmarks

Authors: Franck Cappello, Daniel Etiemble

DOI: 10.5555/370049.370071

Abstract: The hybrid memory model of clusters of multiprocessors raises two issues: programming model and performance. Many parallel programs have been written using the MPI standard. To evaluate the pertinence of hybrid models for existing MPI codes, we compare a unified model (MPI) and a hybrid one (OpenMP fine-grain parallelization after profiling) for the NAS 2.3 benchmarks on IBM SP systems. The superiority of one model depends on 1) the level of shared-memory-model parallelization, 2) the communication patterns, and 3) the memory access patterns. The relative speeds of the main architecture components (CPU, memory, and network) are of tremendous importance for selecting one model. With the hybrid model used, our results show that a unified MPI approach is better for most of the benchmarks. The hybrid approach becomes better only when fast processors make the communication performance significant and the level of parallelization is sufficient.

References (13)
D.J. Scales, K. Gharachorloo, A. Aggarwal. Fine-grain software distributed shared memory on SMP clusters. High-Performance Computer Architecture, pp. 125-136 (1998). DOI: 10.1109/HPCA.1998.650552
F. Cappello, O. Richard, D. Etiemble. Investigating the performance of two programming models for clusters of SMP PCs. High Performance Computer Architecture, pp. 349-359 (2000). DOI: 10.1109/HPCA.2000.824364
Steve W. Bova, Clay P. Breshears, Christine E. Cuicchi, Zeki Demirbilek, Henry A. Gabb. Dual-Level Parallel Analysis of Harbor Wave Response Using MPI and OpenMP. IEEE International Conference on High Performance Computing Data and Analytics, vol. 14, pp. 49-64 (2000). DOI: 10.1177/109434200001400104
Steven S. Lumetta, Alan M. Mainwaring, David E. Culler. Multi-protocol active messages on a cluster of SMP's. Conference on High Performance Computing (Supercomputing), pp. 1-22 (1997). DOI: 10.1145/509593.509596
Frederick C. Wong, Richard P. Martin, Remzi H. Arpaci-Dusseau, David E. Culler. Architectural Requirements and Scalability of the NAS Parallel Benchmarks. Conference on High Performance Computing (Supercomputing), p. 41 (1999). DOI: 10.1145/331532.331573
Hongzhang Shan, Jaswinder Pal Singh. A comparison of MPI, SHMEM and cache-coherent shared address space programming models on the SGI Origin2000. International Conference on Supercomputing, pp. 329-338 (1999). DOI: 10.1145/305138.305210
Patrick H. Worley. Performance evaluation of the IBM SP and the Compaq AlphaServer SC. International Conference on Supercomputing, pp. 235-244 (2000). DOI: 10.1145/335231.335254
Y. Charlie Hu, Honghui Lu, Alan L. Cox, Willy Zwaenepoel. OpenMP for Networks of SMPs. Journal of Parallel and Distributed Computing, vol. 60, pp. 1512-1530 (2000). DOI: 10.1006/JPDC.2000.1658
Andrew Erlichson, Neal Nuckolls, Greg Chesson, John Hennessy. SoftFLASH. Proceedings of the Seventh International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS-VII, vol. 31, pp. 210-220 (1996). DOI: 10.1145/237090.237187