Run-time optimizations for replicated dataflows on heterogeneous environments

Authors: George Teodoro, Timothy D. R. Hartley, Umit Catalyurek, Renato Ferreira

DOI: 10.1145/1851476.1851479

Abstract: The increases in multi-core processor parallelism and in the flexibility of many-core accelerator processors, such as GPUs, have turned traditional SMP systems into hierarchical, heterogeneous computing environments. Fully exploiting these improvements in parallel system design remains an open problem. Moreover, most current tools for developing parallel applications for hierarchical systems concentrate on the use of only a single processor type (e.g., accelerators) and do not coordinate several heterogeneous processors. Here, we show that making use of all of the heterogeneous computing resources can significantly improve application performance. Our approach, which consists of optimizing applications at run-time by efficiently coordinating application task execution on all available processing units, is evaluated in the context of replicated dataflow applications. The proposed techniques were developed and implemented in an integrated run-time system targeting both intra- and inter-node parallelism. The experimental results with a real-world complex biomedical application show that our approach nearly doubles the performance of the GPU-only implementation on a distributed heterogeneous accelerator cluster.
