Authors: Hari Radhakrishnan, Damian W. I. Rouson, Karla Morris, Sameer Shende, Stavros C. Kassinos
DOI: 10.1155/2015/904983
Keywords: Speedup, Object-oriented programming, Shared memory, Computer science, Distributed memory, Multi-core processor, Scalability, Compiler, Parallel computing, Fortran
Abstract: This paper summarizes a strategy for parallelizing a legacy Fortran 77 program using the object-oriented (OO) and coarray features that entered Fortran in the 2003 and 2008 standards, respectively. OO programming (OOP) facilitates the construction of an extensible suite of model-verification and performance tests that drive the development. Coarray parallel programming facilitates a rapid evolution from a serial application to a parallel application capable of running on multicore processors and many-core accelerators in shared and distributed memory. We delineate 17 code modernization steps used to refactor and parallelize the program and study the resulting performance. Our initial studies were done using the Intel Fortran compiler on a 32-core shared memory server. Scaling behavior was very poor, and profile analysis using TAU showed that the bottleneck in the performance was due to our implementation of a collective, sequential summation procedure. We were able to improve the scalability and achieve nearly linear speedup by replacing the sequential summation with a parallel, binary tree algorithm. We also tested the Cray compiler, which provides its own collective summation procedure. Intel provides no collective reductions. With Cray, the program shows linear speedup even in distributed-memory execution. We anticipate similar results with other compilers once they support the new collective procedures proposed for Fortran 2015.