Deep Jam: Conversion of Coarse-Grain Parallelism to Fine-Grain and Vector Parallelism.

Authors: William Jalby, Stéphane Zuckerman, Albert Cohen, Patrick Carribault

Abstract: A number of computational applications lack instruction-level parallelism. This loss is particularly acute on sequences of dependent instructions on wide-issue or deeply pipelined architectures. We consider four real applications from computational biology, cryptanalysis, and data compression. These applications are characterized by long sequences of dependent instructions, irregular control-flow, and intricate scalar and memory dependence patterns. While these benchmarks exhibit good memory locality and branch-predictability, state-of-the-art compiler optimizations fail to exploit much instruction-level parallelism. This paper shows that major performance gains are possible on such applications, through a loop transformation called deep jam. This transformation reshapes the control-flow of the program to facilitate the extraction of independent computations through classical back-end techniques. Deep jam combines accurate dependence analysis and control speculation with a generalized form of recursive, multi-variant unroll-and-jam; it brings together independent instructions across irregular control structures, removing memory-based dependences through scalar and array renaming. This optimization contributes to the extraction of fine-grain parallelism in irregular applications. We propose a feedback-directed deep jam algorithm, selecting a jamming strategy as a function of the architecture and application characteristics.
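To make the idea of jamming independent dependence chains concrete, here is a minimal hand-written sketch in C. It is not the paper's algorithm or its benchmarks: the kernel `step`, the function names, and the data layout are all hypothetical, and the transformation shown is a plain two-way unroll-and-jam of an outer loop whose inner loop has a data-dependent trip count. Each outer iteration is a serial chain, so the baseline offers little instruction-level parallelism; the jammed version interleaves two chains so that a wide-issue core has two independent instructions per step.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kernel: one link of a serial dependence chain. */
static uint32_t step(uint32_t x) { return x * 1664525u + 1013904223u; }

/* Baseline: each outer iteration runs its own chain of len[i] dependent steps. */
void chains_baseline(const uint32_t *seed, const unsigned *len,
                     uint32_t *out, size_t n) {
    for (size_t i = 0; i < n; i++) {
        uint32_t x = seed[i];
        for (unsigned k = 0; k < len[i]; k++)
            x = step(x);
        out[i] = x;
    }
}

/* Jammed: two outer iterations are fused so their chains advance together. */
void chains_jammed(const uint32_t *seed, const unsigned *len,
                   uint32_t *out, size_t n) {
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        /* Scalar renaming: x0 and x1 keep the two chains independent. */
        uint32_t x0 = seed[i], x1 = seed[i + 1];
        unsigned k0 = 0, k1 = 0;
        const unsigned n0 = len[i], n1 = len[i + 1];

        /* Variant 1: both chains active, two independent steps per trip. */
        while (k0 < n0 && k1 < n1) {
            x0 = step(x0); k0++;
            x1 = step(x1); k1++;
        }
        /* Variant 2: only the first chain still has work left. */
        while (k0 < n0) { x0 = step(x0); k0++; }
        /* Variant 3: only the second chain still has work left. */
        while (k1 < n1) { x1 = step(x1); k1++; }

        out[i] = x0;
        out[i + 1] = x1;
    }
    /* Remainder iteration when n is odd. */
    for (; i < n; i++) {
        uint32_t x = seed[i];
        for (unsigned k = 0; k < len[i]; k++)
            x = step(x);
        out[i] = x;
    }
}
```

Both functions compute the same `out` array; only the instruction schedule exposed to the back end differs. The three inner loops are one simple way to handle unequal trip counts, loosely in the spirit of the "multi-variant" jamming the abstract mentions. The choice of a jamming factor of two is arbitrary here, whereas the abstract describes selecting the strategy from feedback about the architecture and the application.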
