Authors: Masahiro Nakao, Hitoshi Murai, Takenori Shimosaka, Akihiro Tabuchi, Toshihiro Hanawa
Keywords:
Abstract: The present paper introduces the XcalableACC (XACC) programming model, which is a hybrid model of the XcalableMP (XMP) Partitioned Global Address Space (PGAS) language and OpenACC. XACC defines directives that enable programmers to mix XMP and OpenACC directives in order to develop applications that can use accelerator clusters with ease. Moreover, to improve the performance of stencil applications, the Omni compiler provides functions that transfer a halo region in accelerator memory via Tightly Coupled Accelerators (TCA), a proprietary network for transferring data directly among accelerators. In this paper, we evaluate the productivity and performance of XACC through implementations of the HIMENO Benchmark. The results show that, thanks to these improvements, XACC requires less than half the source lines of code compared with a combination of Message Passing Interface (MPI) and OpenACC, which are commonly used together as a typical programming model. As a result of these improvements, XACC using TCA achieved up to 2.7 times faster performance than could be obtained with MPI using GPUDirect RDMA over InfiniBand.