Deep Jam: Conversion of Coarse-Grain Parallelism to Fine-Grain and Vector Parallelism.

Authors: William Jalby, Stéphane Zuckerman, Albert Cohen, Patrick Carribault

DOI:

Keywords:

Abstract: A number of computational applications lack instruction-level parallelism. This loss is particularly acute on sequences of dependent instructions on wide-issue or deeply pipelined architectures. We consider four real applications from computational biology, cryptanalysis, and data compression. These applications are characterized by long sequences of dependent instructions, irregular control-flow, and intricate scalar and memory dependence patterns. While these benchmarks exhibit good memory locality and branch-predictability, state-of-the-art compiler optimizations fail to exploit much instruction-level parallelism. This paper shows that major performance gains are possible on such applications, through a loop transformation called deep jam. This transformation reshapes the control-flow of a program to facilitate the extraction of independent computations through classical back-end techniques. Deep jam combines accurate dependence analysis and control speculation with a generalized form of recursive, multi-variant unroll-and-jam; it brings together independent instructions across irregular control structures, removing memory-based dependences through scalar and array renaming. This optimization contributes to the extraction of fine-grain parallelism in irregular applications. We propose a feedback-directed deep jam algorithm, selecting a jamming strategy as a function of the architecture and application characteristics.
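The unroll-and-jam idea that the abstract builds on can be illustrated at source level. The sketch below is only an illustration under stated assumptions, not the paper's algorithm: the Collatz step count is a hypothetical workload chosen because each outer iteration is a long dependence chain with a data-dependent inner `while` loop, the unroll factor is fixed at 2, and the function names are invented for this example. It omits the dependence analysis, control speculation, array renaming, and feedback-directed strategy selection that the abstract mentions. Python is used only to show the shape of the transformed loops; any instruction-level parallelism benefit would arise in compiled code.

```python
def collatz_steps(values):
    """Baseline: each outer iteration is one long chain of dependent operations."""
    out = []
    for n in values:  # assumes every value is a positive integer
        steps = 0
        while n != 1:
            n = n // 2 if n % 2 == 0 else 3 * n + 1
            steps += 1
        out.append(steps)
    return out


def collatz_steps_jammed(values):
    """Outer loop unrolled by 2, with the two inner while loops jammed together."""
    out = []
    i = 0
    while i + 1 < len(values):
        # Scalar renaming: each unrolled iteration gets its own copies (n0/s0, n1/s1).
        n0, n1 = values[i], values[i + 1]
        s0 = s1 = 0
        # Jammed body: runs while BOTH chains are active, so the body holds
        # two independent dependence chains that a back end could overlap.
        while n0 != 1 and n1 != 1:
            n0 = n0 // 2 if n0 % 2 == 0 else 3 * n0 + 1
            n1 = n1 // 2 if n1 % 2 == 0 else 3 * n1 + 1
            s0 += 1
            s1 += 1
        # Remainder variants: at most one of these loops executes, finishing
        # whichever chain outlasted the other.
        while n0 != 1:
            n0 = n0 // 2 if n0 % 2 == 0 else 3 * n0 + 1
            s0 += 1
        while n1 != 1:
            n1 = n1 // 2 if n1 % 2 == 0 else 3 * n1 + 1
            s1 += 1
        out.append(s0)
        out.append(s1)
        i += 2
    # Epilogue for an odd number of outer iterations.
    if i < len(values):
        out.extend(collatz_steps(values[i:]))
    return out


if __name__ == "__main__":
    data = [1, 2, 3, 6, 7]
    print(collatz_steps(data))
    print(collatz_steps_jammed(data))
```

The jammed version computes the same results as the baseline; the difference is that its main inner loop carries two independent chains instead of one. The separate remainder loops hint at why a jammed irregular loop needs more than one variant of the body.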

References (45)
Silvius Rus, Lawrence Rauchwerger, Jay Hoeflinger. Hybrid analysis: static & dynamic memory reference analysis. International Journal of Parallel Programming, vol. 31, pp. 251-283 (2003). DOI: 10.1023/A:1024597010150
Keith D. Cooper, Devika Subramanian, Linda Torczon. Adaptive Optimizing Compilers for the 21st Century. The Journal of Supercomputing, vol. 23, pp. 7-22 (2002). DOI: 10.1023/A:1015729001611
Jian Wang, Guang R. Gao. Pipelining-dovetailing: A transformation to enhance software pipelining for nested loops. Lecture Notes in Computer Science, pp. 1-16 (1996). DOI: 10.1007/3-540-61053-7_49
Heikki Hyyrö, Gonzalo Navarro. Faster Bit-Parallel Approximate String Matching. Combinatorial Pattern Matching, pp. 203-224 (2002). DOI: 10.1007/3-540-45452-7_18
Michael D. Smith. Overcoming the challenges to feedback-directed optimization. SIGPLAN Notices, vol. 35, pp. 1-11 (2000).
Martin Griebl, Jean-François Collard. Generation of Synchronous Code for Automatic Parallelization of while Loops. European Conference on Parallel Processing, pp. 315-326 (1995). DOI: 10.1007/BFB0020474
Ken Kennedy, Kathryn S. McKinley. Maximizing Loop Parallelism and Improving Data Locality via Loop Fusion and Distribution. Languages and Compilers for Parallel Computing, pp. 301-320 (1993). DOI: 10.1007/3-540-57659-2_18
José González, Antonio González. The potential of data value speculation to boost ILP. International Conference on Supercomputing, pp. 21-28 (1998). DOI: 10.1145/277830.277840
Ron Cytron, Jeanne Ferrante, Barry K. Rosen, Mark N. Wegman, F. Kenneth Zadeck. Efficiently computing static single assignment form and the control dependence graph. ACM Transactions on Programming Languages and Systems, vol. 13, pp. 451-490 (1991). DOI: 10.1145/115372.115320