Authors: William Jalby, Stéphane Zuckerman, Albert Cohen, Patrick Carribault
DOI:
Keywords:
Abstract: A number of computational applications lack instruction-level parallelism. This loss is particularly acute on sequences of dependent instructions on wide-issue or deeply pipelined architectures. We consider four real applications from computational biology, cryptanalysis, and data compression. These applications are characterized by long sequences of dependent instructions, irregular control flow, and intricate scalar and memory dependence patterns. While these benchmarks exhibit good memory locality and branch predictability, state-of-the-art compiler optimizations fail to exploit much instruction-level parallelism. This paper shows that major performance gains are possible on such applications, through a loop transformation called deep jam. This transformation reshapes the control flow of a program to facilitate the extraction of independent computations through classical back-end techniques. Deep jam combines accurate dependence analysis and control speculation with a generalized form of recursive, multi-variant unroll-and-jam; it brings together independent instructions across irregular control structures, removing memory-based dependences through scalar and array renaming. This optimization contributes to the extraction of fine-grain parallelism in irregular applications. We propose a feedback-directed deep jam algorithm, selecting a jamming strategy as a function of the architecture and application characteristics.
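Note: the abstract describes deep jam as a generalization of unroll-and-jam. As background only, the sketch below shows classical unroll-and-jam on a regular loop nest in C. It is not the paper's algorithm: the function names, array sizes, and the hash-like recurrence are hypothetical, chosen to stand in for a long sequence of dependent instructions. Unrolling the outer loop by two and fusing the two inner loops places two independent dependence chains in the same loop body, which a back-end scheduler may then overlap.

```c
#include <stddef.h>
#include <stdint.h>

#define ROWS 5   /* hypothetical sizes; odd ROWS exercises the remainder loop */
#define COLS 8

/* Baseline: each row is one serial chain, since every update of h
 * needs the previous value of h. */
static void chain_baseline(uint32_t in[ROWS][COLS], uint32_t out[ROWS])
{
    for (size_t i = 0; i < ROWS; i++) {
        uint32_t h = 17u;
        for (size_t j = 0; j < COLS; j++)
            h = h * 31u + in[i][j];
        out[i] = h;
    }
}

/* Unroll-and-jam by 2: two outer iterations are unrolled and their inner
 * loops fused. h0 and h1 are separate scalars (scalar renaming), so the
 * two chains are independent of each other inside the fused body. */
static void chain_unroll_and_jam(uint32_t in[ROWS][COLS], uint32_t out[ROWS])
{
    size_t i = 0;
    for (; i + 1 < ROWS; i += 2) {
        uint32_t h0 = 17u, h1 = 17u;
        for (size_t j = 0; j < COLS; j++) {
            h0 = h0 * 31u + in[i][j];
            h1 = h1 * 31u + in[i + 1][j];
        }
        out[i] = h0;
        out[i + 1] = h1;
    }
    /* Remainder: handle the last row when ROWS is odd. */
    for (; i < ROWS; i++) {
        uint32_t h = 17u;
        for (size_t j = 0; j < COLS; j++)
            h = h * 31u + in[i][j];
        out[i] = h;
    }
}
```

Both functions compute the same results; only the order of the work changes. The jam is legal here because the rows are independent and the inner trip count is the same for every row. According to the abstract, deep jam targets the harder case of irregular control flow and memory dependences, which this regular example does not cover.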