Authors: Christian Terboven, Dirk Schmidl, Tim Cramer, Dieter an Mey
DOI: 10.1007/978-3-642-30961-8_14
Keywords:
Abstract: The introduction of task-level parallelization promises to raise the level of abstraction compared to thread-centric expression of parallelism. However, tasks might exhibit poor performance on NUMA systems if data locality cannot be maintained. In contrast to traditional OpenMP worksharing constructs, for which threads can be bound, the behavior of tasks is much less predetermined by the OpenMP specification, and implementations have a high degree of freedom in implementing task scheduling. Employing different approaches to express task-parallelism, namely the single-producer and parallel-producer patterns with different data initialization strategies, we compare the behavior and quality of OpenMP implementations with task-parallel codes on NUMA architectures. For the programmer, we propose recipes to express parallelism with tasks that preserve data locality while optimizing the degree of parallelism. Our proposals are evaluated on reasonably large NUMA systems with both important application kernels and a real-world simulation code.