A Comparison of Compiler Tiling Algorithms

Authors: Gabriel Rivera, Chau-Wen Tseng

DOI: 10.1007/978-3-540-49051-7_12

Keywords: Algorithm, Compiler, Computer science, Optimizing compiler, Linear algebra, Matrix multiplication, Padding, CPU cache, Parallel computing, Program optimization, Loop tiling

Abstract: Linear algebra codes contain data locality which can be exploited by tiling multiple loop nests. Several approaches to tiling have been suggested for avoiding conflict misses in low-associativity caches. We propose a new technique based on intra-variable padding and compare its performance with existing techniques. Results show that padding improves the performance of matrix multiply by over 100% in some cases over a range of matrix sizes. Comparing the efficacy of different tiling algorithms, we discover that rectangular tiles are slightly more efficient than square tiles. Overall, tiling improves performance by 0-250%. Copying tiles at run time proves to be quite effective.
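The two ideas named in the abstract, tiling a matrix-multiply loop nest and intra-variable padding, can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names (`matmul_tiled`, `pad_rows`), the tile dimensions, and the padding amount are all hypothetical, and the paper's methods for choosing tile sizes and pad amounts are not reproduced here. The sketch only shows the loop structure: rectangular tiles of `tile_h` x `tile_w` elements of B are reused across all rows of A, and a row stride `ld` larger than `n` stands in for padding within a single array.

```python
def pad_rows(flat, n, ld):
    """Copy an n x n row-major matrix into a layout whose row stride is ld.

    Choosing ld > n adds unused elements at the end of every row, which is
    the layout change that intra-variable padding refers to.
    """
    out = [0.0] * (n * ld)
    for i in range(n):
        out[i * ld:i * ld + n] = flat[i * n:(i + 1) * n]
    return out


def matmul_tiled(a, b, n, ld, tile_h, tile_w):
    """Return C = A * B for n x n matrices stored row-major with stride ld.

    The kk/jj loops walk over tile_h x tile_w tiles of B (rectangular when
    tile_h != tile_w); each tile is reused for every row i of A.
    """
    c = [0.0] * (n * ld)
    for kk in range(0, n, tile_h):            # first row of the current B tile
        k_end = min(kk + tile_h, n)
        for jj in range(0, n, tile_w):        # first column of the current B tile
            j_end = min(jj + tile_w, n)
            for i in range(n):
                for k in range(kk, k_end):
                    a_ik = a[i * ld + k]
                    for j in range(jj, j_end):
                        c[i * ld + j] += a_ik * b[k * ld + j]
    return c
```

Pure Python will not show the cache effects the paper measures; the sketch is meant only to make the index arithmetic concrete. In a compiled language, the same loop nest with `ld` chosen to avoid unfavorable strides is where conflict misses would be expected to change.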

References (27)
Karim Esseghir. Improving data locality for caches. (1993)
Nicholas Mitchell, Larry Carter, Jeanne Ferrante, Karin Högstedt. Quantifying the Multi-level Nature of Tiling Interactions. Languages and Compilers for Parallel Computing, pp. 1-15 (1997). DOI: 10.1007/BFB0032680
Thomas J. Watson IBM Research Center. On Estimating and Enhancing Cache Effectiveness. Languages and Compilers for Parallel Computing, pp. 328-343 (1991). DOI: 10.1007/BFB0038674
David F. Bacon, Jyh-Herng Chow, Dz-ching R. Ju, Kalyan Muthukumar, Vivek Sarkar. A compiler framework for restructuring data declarations to enhance cache and TLB effectiveness. Conference of the Centre for Advanced Studies on Collaborative Research, pp. 3- (2010). DOI: 10.1145/1925805.1925813
Michał Cierniak, Wei Li. Unifying data and control transformations for distributed shared-memory machines. Proceedings of the ACM SIGPLAN 1995 Conference on Programming Language Design and Implementation (PLDI '95), vol. 30, pp. 205-217 (1995). DOI: 10.1145/207110.207145
Gabriel Rivera, Chau-Wen Tseng. Eliminating conflict misses for high performance architectures. International Conference on Supercomputing, pp. 353-360 (1998). DOI: 10.1145/277830.277917
O. Temam, C. Fricker, W. Jalby. Cache interference phenomena. Proceedings of the 1994 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems (SIGMETRICS '94), vol. 22, pp. 261-271 (1994). DOI: 10.1145/183018.183047
David H. Bailey. Unfavorable Strides in Cache Memory Systems (RNR Technical Report RNR-92-015). Scientific Programming, vol. 4, pp. 53-58 (1995). DOI: 10.1155/1995/937016
M. Kandemir, J. Ramanujam, A. Choudhary. A compiler algorithm for optimizing locality in loop nests. International Conference on Supercomputing, pp. 269-276 (1997). DOI: 10.1145/263580.263650
M. Wolfe. More iteration space tiling. Proceedings of the 1989 ACM/IEEE Conference on Supercomputing (Supercomputing '89), pp. 655-664 (1989). DOI: 10.1145/76263.76337