Authors: Shizhao Chen, Jianbin Fang, Donglin Chen, Chuanfu Xu, Zheng Wang
DOI: 10.1109/HPCC/SMARTCITY/DSS.2018.00116
Keywords:
Abstract: Sparse matrix vector multiplication (SpMV) is one of the most common operations in scientific and high-performance applications and is often responsible for the application performance bottleneck. While the sparse matrix representation has a significant impact on the resulting performance, choosing the right representation typically relies on expert knowledge and trial and error. This paper provides the first comprehensive study of sparse matrix representations on two emerging many-core architectures: Intel's Knights Landing (KNL) XeonPhi and the ARM-based FT-2000Plus (FTP). Our large-scale experiments involved over 9,500 distinct profiling runs performed on 956 datasets and five mainstream SpMV representations. We show that the best representation depends on the underlying architecture and the program input. To help developers choose the optimal representation, we employ machine learning to develop a predictive model. The model is trained offline using a set of training examples. The learned model can be used to predict the best representation for any unseen input on a given architecture. Our model delivers on average 95% and 91% of the best available performance on KNL and FTP, respectively, and it achieves this with no runtime overhead.