Authors: Easwaran Raman, David I. August
DOI:
Keywords:
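Abstract: Continuing exponential growth in transistor density and diminishing returns from the increasing transistor count have forced processor manufacturers to pack multiple processor cores onto a single chip. These processors, known as multi-core processors, generally do not improve the performance of single-threaded applications. Automatic parallelization has a key role to play in improving the performance of legacy and newly written single-threaded applications in this new multi-threaded era. Automatic parallelization techniques transform single-threaded code into semantically equivalent multi-threaded code by preserving the dependences of the original code. This dissertation proposes two new automatic parallelization techniques that differ from related existing techniques in their handling of dependences. This difference in dependence handling enables the proposed techniques to outperform related techniques. The first technique is parallel-stage decoupled software pipelining (PS-DSWP). PS-DSWP extends pipelined parallelization techniques like DSWP by allowing certain pipeline stages to be executed by multiple threads. Such parallel execution requires distinguishing the inter-iteration dependences of the loop being parallelized from the rest of the dependences. The applicability and effectiveness of PS-DSWP are further enhanced by applying speculation to remove some dependences. The second technique, speculative parallel iteration chunk execution (Spice), uses value speculation to ignore inter-iteration dependences, enabling speculative execution of chunks of iterations in parallel. Unlike other value-speculation based parallelization techniques, Spice speculates on only a few dynamic instances of those inter-iteration dependences. Both these techniques are implemented in the VELOCITY compiler and evaluated using a multi-core Itanium 2 simulator. PS-DSWP results in a geometric mean speedup of 2.13 over single-threaded execution with five threads on a set of benchmark loops. The use of speculation improves the performance of PS-DSWP, resulting in a geometric mean speedup of 3.67 with six threads. Spice shows a geometric mean speedup of 2.01 with four threads. Based on the above experimental results and on qualitative and quantitative comparisons with related techniques, this dissertation demonstrates the effectiveness of the proposed techniques.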
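The pipeline structure that the abstract attributes to PS-DSWP can be pictured with a small sketch. This is an illustration only, not the VELOCITY compiler's transformation: the function name `ps_dswp_sketch`, the three-stage split, and the queue-based communication are all assumptions made for the example. The first stage stands in for the part of the loop that carries the inter-iteration dependence, the replicated middle stage stands in for work with no such dependence, and the last stage collects results in iteration order.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the iteration stream


def ps_dswp_sketch(items, work, num_workers=3):
    """Hypothetical three-stage pipeline with a replicated middle stage."""
    to_workers = queue.Queue()
    to_collector = queue.Queue()
    results = {}

    def producer():
        # Stage 1 (one thread): walks the iterations in order, as a stage
        # holding the inter-iteration dependence would have to.
        for index, item in enumerate(items):
            to_workers.put((index, item))
        for _ in range(num_workers):
            to_workers.put(SENTINEL)

    def worker():
        # Stage 2 (several threads): iterations are independent here,
        # so any worker may take any iteration.
        while True:
            task = to_workers.get()
            if task is SENTINEL:
                to_collector.put(SENTINEL)
                return
            index, item = task
            to_collector.put((index, work(item)))

    def collector():
        # Stage 3 (one thread): gathers results keyed by iteration number.
        finished = 0
        while finished < num_workers:
            message = to_collector.get()
            if message is SENTINEL:
                finished += 1
            else:
                index, value = message
                results[index] = value

    threads = [threading.Thread(target=producer)]
    threads += [threading.Thread(target=worker) for _ in range(num_workers)]
    threads.append(threading.Thread(target=collector))
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    # Restore the original iteration order before returning.
    return [results[i] for i in range(len(results))]


if __name__ == "__main__":
    print(ps_dswp_sketch([1, 2, 3, 4], lambda x: x * x))
```

Because of CPython's global interpreter lock, this sketch shows only the shape of the pipeline and should not be expected to reproduce the speedups reported in the abstract.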