Authors: Rosa Maria Badia Sala, Jesús José Labarta Mancho
DOI:
Keywords:
Abstract: Designing parallel codes is hard. One of the most important roadblocks to parallel programming is the presence of data dependencies. These restrict parallelism and, in general, working around them requires complex analysis and leads to convoluted solutions that decrease the quality of the code. This thesis proposes a solution that incorporates data dependencies into the programming model. The model can handle this information and dynamically find parallelism that would otherwise be hard to find. This approach improves both programmability and parallelism, and thus performance. While this problem has already been solved in OpenMP 4 at the time of publication, this research began before it was even being considered for OpenMP 3. In fact, some of the contributions have had an influence on the approach taken in OpenMP 4. However, the contributions go beyond that approach and cover aspects that have not yet been considered in OpenMP 4. The approach we propose is based on function-level dependencies across disjoint blocks of contiguous memory. While finding dependencies under those constraints is simple, it is much harder to do so over strided and possibly partially overlapping sets of data. This thesis also proposes a solution to this problem. By doing so, we increase the range of applicability and the span of the original solution. OpenMP 4 does not currently cover this aspect. Finally, we present a solution to take advantage of the performance characteristics of Non-Uniform Memory Access architectures. Our proposal is at the programming model level and does not require changes to the code. It automatically distributes the data and does not rely on data migration nor replication. Instead, it is based exclusively on scheduling the computations. While this process is automatic, it can be tuned through minor code changes that do not require any change to the programming model. Throughout this thesis, we demonstrate the effectiveness of the proposal through benchmarks that are either hard to program using other paradigms or that have different solutions. In most cases, our solutions perform on par with or better than existing solutions. This includes implementations available in well-known high-performance libraries.