Profiling Data-Dependence to Assist Parallelization: Framework, Scope, and Optimization

作者: Alain Ketterlin , Philippe Clauss

DOI: 10.1109/MICRO.2012.47

关键词:

摘要: This paper describes a tool using one or more executions of sequential program to detect parallel portions the program. The tool, called Par wiz, uses dynamic binary instrumentation, targets various forms parallelism, and suggests distinct parallelization actions, ranging from simple directive tagging elaborate loop transformations. first part details link between program's static structures (like routines loops), memory accesses performed by program, dependencies that are used highlight potential parallelism. also instrumentation involved, general architecture system. second puts framework into action. study focuses on targeting OpenMP parallel-for directives, including privatization when necessary. is an adaptation well-known vectorization technique based slightly richer dependence description, where transformation. third views loops as graph (hopefully lightly) dependent iterations. explains how overall cost data-dependence profiling can be reduced. has two major causes: first, instrumenting slows down second, turning graphs consumes processing time. wiz analysis original (binary) provide data at coarser level, moving individual complete whenever possible, thereby reducing impact both sources inefficiency.

参考文章(29)
Minjang Kim, Hyesoon Kim, Dynamic program analysis algorithms to assist parallelization Georgia Institute of Technology. ,(2012)
Tong Chen, Jin Lin, Xiaoru Dai, Wei-Chung Hsu, Pen-Chung Yew, Data Dependence Profiling for Speculative Optimizations compiler construction. pp. 57- 72 ,(2004) , 10.1007/978-3-540-24723-4_5
Alain Darte, Frédéric Vivien, On the Optimality of Allen and Kennedy's Algorithm for Parallel Extraction in Nested Loops european conference on parallel processing. ,vol. 1123, pp. 379- 388 ,(1996) , 10.1007/3-540-61626-8_50
Ron Cytron, Jeanne Ferrante, Barry K. Rosen, Mark N. Wegman, F. Kenneth Zadeck, Efficiently computing static single assignment form and the control dependence graph ACM Transactions on Programming Languages and Systems. ,vol. 13, pp. 451- 490 ,(1991) , 10.1145/115372.115320
J.R. Larus, Loop-level parallelism in numeric and symbolic programs IEEE Transactions on Parallel and Distributed Systems. ,vol. 4, pp. 812- 826 ,(1993) , 10.1109/71.238302
Zhao-Hui Du, Chu-Cheow Lim, Xiao-Feng Li, Chen Yang, Qingyu Zhao, Tin-Fook Ngai, A cost-driven compilation framework for speculative parallelization of sequential programs Proceedings of the ACM SIGPLAN 2004 conference on Programming language design and implementation - PLDI '04. ,vol. 39, pp. 71- 81 ,(2004) , 10.1145/996841.996852
Vugranam C. Sreedhar, Guang R. Gao, Yong-Fong Lee, Identifying loops using DJ graphs ACM Transactions on Programming Languages and Systems. ,vol. 18, pp. 649- 658 ,(1996) , 10.1145/236114.236115
Sean Rul, Hans Vandierendonck, Koen De Bosschere, Towards automatic program partitioning Proceedings of the 6th ACM conference on Computing frontiers - CF '09. pp. 89- 98 ,(2009) , 10.1145/1531743.1531759
ALAIN DARTE, FRÉDÉRIC VIVIEN*†, ON THE OPTIMALITY OF ALLEN AND KENNEDY'S ALGORITHM FOR PARALLELISM EXTRACTION IN NESTED LOOPS Parallel Algorithms and Applications. ,vol. 12, pp. 83- 112 ,(1997) , 10.1080/01495739708941417