Parallel For Loops on Heterogeneous Resources

Author: Frederick Edward Weber

DOI:

Keywords:

Abstract: In recent years, Graphics Processing Units (GPUs) have piqued the interest of researchers in scientific computing. Their immense floating point throughput and massive parallelism make them ideal not just for graphical applications, but for many general algorithms as well. Load balancing applications to take advantage of all computational resources in a machine is a difficult challenge, especially when the resources are heterogeneous. This dissertation presents the clUtil library, which vastly simplifies developing OpenCL applications for heterogeneous systems. The core focus of this dissertation lies in clUtil's ParallelFor construct and our novel PINA scheduler, which can efficiently load balance work onto multiple GPUs and CPUs simultaneously.
