Compiling Stencils in High Performance Fortran

Authors: Gerald Roth, John Mellor-Crummey, Ken Kennedy, R. Gregg Brickner

DOI: 10.1145/509593.509605

Keywords:

Abstract: For many Fortran90 and HPF programs performing dense matrix computations, the main computational portion of the program belongs to a class of kernels known as stencils. Stencil computations are commonly used in solving partial differential equations, image processing, and geometric modeling. The efficient handling of such stencils is critical for achieving high performance on distributed-memory machines. Compiling stencils into efficient code is viewed as so important that some companies have built special-purpose compilers for handling them and others have added stencil-recognizers to existing compilers. In this paper we present a general-purpose compilation strategy for stencils written using Fortran90 array constructs. Our strategy is capable of optimizing single-statement or multi-statement stencils and is applicable to stencils specified with shift intrinsics or with array-syntax all equally well. The strategy eliminates the need for pattern-recognition algorithms by orchestrating a set of optimizations that address the overhead of both intraprocessor and interprocessor data movement that results from the translation of Fortran90 array constructs. Our experimental results show that code produced by this strategy beats or matches the best code produced by the special-purpose compilers or pattern-recognition schemes that are known to us. In addition, our strategy produces highly optimized code in situations where the others fail, producing several orders of magnitude performance improvement, and thus provides a stencil compilation strategy that is more robust than its predecessors.
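As a rough illustration of the two stencil notations the abstract mentions, the sketch below writes one five-point stencil twice: once with circular shifts in the style of Fortran90's CSHIFT intrinsic, and once with explicit neighbor indexing in the style of array sections. It is plain Python rather than Fortran, it is not the paper's compilation strategy, and the helper names (`cshift`, `add`, `stencil_with_shifts`, `stencil_with_sections`) are hypothetical. The shift form builds a full shifted copy of the array for every neighbor, which hints at the intraprocessor data-movement overhead the abstract refers to.

```python
def cshift(grid, shift, dim):
    """Circular shift modeled on Fortran's CSHIFT(ARRAY, SHIFT, DIM) for a 2-D list.

    Along the chosen dimension, result[i] = grid[(i + shift) mod n].
    dim=1 shifts along the first index (rows), dim=2 along the second (columns).
    """
    n_rows, n_cols = len(grid), len(grid[0])
    if dim == 1:
        return [list(grid[(i + shift) % n_rows]) for i in range(n_rows)]
    return [[row[(j + shift) % n_cols] for j in range(n_cols)] for row in grid]


def add(*grids):
    """Element-wise sum of equally shaped 2-D lists (stands in for array syntax)."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*grids)]


def stencil_with_shifts(src):
    """Five-point stencil in shift-intrinsic style.

    Each cshift call materializes a whole temporary array.
    """
    return add(
        src,
        cshift(src, -1, 1),  # neighbor at i-1
        cshift(src, +1, 1),  # neighbor at i+1
        cshift(src, -1, 2),  # neighbor at j-1
        cshift(src, +1, 2),  # neighbor at j+1
    )


def stencil_with_sections(src):
    """Same stencil on interior points only, indexing neighbors directly.

    Boundary elements are left at 0; no temporary arrays are created.
    """
    n_rows, n_cols = len(src), len(src[0])
    dst = [[0] * n_cols for _ in range(n_rows)]
    for i in range(1, n_rows - 1):
        for j in range(1, n_cols - 1):
            dst[i][j] = (src[i][j] + src[i - 1][j] + src[i + 1][j]
                         + src[i][j - 1] + src[i][j + 1])
    return dst


if __name__ == "__main__":
    src = [[i * 4 + j for j in range(4)] for i in range(4)]
    shifted = stencil_with_shifts(src)
    sectioned = stencil_with_sections(src)
    # The two forms agree on interior points; they differ only on the boundary,
    # where the circular shifts wrap around.
    print(shifted[1][1], sectioned[1][1])
```

The two forms agree on every interior point, so a compiler is free to treat them as the same computation and choose where data actually has to move.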

References (18)
Ken Kennedy, John Mellor-Crummey, Gerald Roth. Optimizing Fortran 90 shift operations on distributed-memory multicomputers. Languages and Compilers for Parallel Computing, pp. 161-175 (1996). DOI: 10.1007/BFB0014198
Gerald H. Roth. Optimizing Fortran90D/HPF for distributed-memory computers. Rice University (1997).
Ron Cytron, Jeanne Ferrante, Barry K. Rosen, Mark N. Wegman, F. Kenneth Zadeck. Efficiently computing static single assignment form and the control dependence graph. ACM Transactions on Programming Languages and Systems, vol. 13, pp. 451-490 (1991). DOI: 10.1145/115372.115320
Zeki Bozkus, Larry Meadows, Steven Nakamoto, Vincent Schuster, Mark Young. PGHPF - an optimizing High Performance Fortran compiler for distributed memory machines. Scientific Programming, vol. 6, pp. 29-40 (1997). DOI: 10.1155/1997/705102
William George, Ralph G. Brickner, S. Lennart Johnsson. POLYSHIFT communications software for the connection machine system CM-200. Scientific Programming, vol. 3, pp. 83-99 (1994). DOI: 10.1155/1994/987498
Michael Gerndt. Updating distributed variables in local computations. Concurrency and Computation: Practice and Experience, vol. 2, pp. 171-193 (1990). DOI: 10.1002/CPE.4330020303
Manish Gupta, Sam Midkiff, Edith Schonberg, Ven Seshadri, David Shields, Ko-Yang Wang, Wai-Mee Ching, Ton Ngo. An HPF Compiler for the IBM SP2. Conference on High Performance Computing (Supercomputing), p. 71 (1995). DOI: 10.1145/224170.224422
Kathryn S. McKinley, Steve Carr, Chau-Wen Tseng. Improving data locality with loop transformations. ACM Transactions on Programming Languages and Systems, vol. 18, pp. 424-453 (1996). DOI: 10.1145/233561.233564