Authors: Gerald Roth, John Mellor-Crummey, Ken Kennedy, R. Gregg Brickner
Keywords:
Abstract: For many Fortran90 and HPF programs performing dense matrix computations, the main computational portion of the program belongs to a class of kernels known as stencils. Stencil computations are commonly used in solving partial differential equations, image processing, and geometric modeling. The efficient handling of such stencils is critical for achieving high performance on distributed-memory machines. Compiling stencils into efficient code is viewed as so important that some companies have built special-purpose compilers for handling them, and others have added stencil-recognizers to existing compilers. In this paper we present a general-purpose compilation strategy for stencils written using Fortran90 array constructs. Our strategy is capable of optimizing single or multi-statement stencils and is applicable to stencils specified with shift intrinsics or with array-syntax, all equally well. The strategy eliminates the need for pattern-recognition algorithms by orchestrating a set of optimizations that address the overhead of both intraprocessor and interprocessor data movement that results from the translation of Fortran90 array constructs. Our experimental results show that code produced by this strategy beats or matches the best code produced by the special-purpose compilers or pattern-recognition schemes that are known to us. In addition, our strategy produces highly optimized code in situations where the others fail, producing several orders of magnitude performance improvement, and thus provides a stencil compilation strategy that is more robust than its predecessors.