Dynamic Task Execution on Shared and Distributed Memory Architectures

Author: Asim YarKhan

Keywords: Supercomputer architecture, Parallel computing, Distributed memory, Data diffusion machine, Interleaved memory, Dataflow, Distributed shared memory, Computer science, Uniform memory access, Shared memory

Abstract: Multicore architectures with high core counts have come to dominate the world of high performance computing, from shared memory machines to the largest distributed memory clusters. The multicore route to increased performance has a simpler design and better power efficiency than the traditional approach of increasing processor frequencies. However, standard programming techniques are not well adapted to this change in computer architecture design. In this work, we study the use of dynamic runtime environments executing data-driven applications as a solution to programming multicore architectures. The goals of our runtime environments are productivity, scalability, and performance. We demonstrate productivity by defining a simple programming interface to express code. Our runtime environments are experimentally shown to be scalable and to give competitive performance on large multicore and distributed memory machines. This work is driven by linear algebra algorithms, where state-of-the-art libraries (e.g., LAPACK and ScaLAPACK) using a fork-join or block-synchronous execution style do not use the available resources in the most efficient manner. Research work in linear algebra has reformulated these algorithms as tasks acting on tiles of data, with data dependency relationships between the tasks. This results in a task-based DAG for the reformulated algorithms, which can be executed via asynchronous data-driven execution paths analogous to dataflow execution. We study an API and runtime environment for shared memory architectures that efficiently executes serially presented tile-based algorithms. This runtime is used to enable linear algebra applications and is shown to deliver performance competitive with state-of-the-art commercial and research libraries.
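The idea of turning a serially presented tile algorithm into a task DAG can be illustrated with a minimal sketch. This is not the API studied in the work; `TaskGraph`, `insert`, `run`, and `tile_cholesky_dag` are hypothetical names chosen for illustration. The sketch assumes each task declares which tiles it reads and which it updates, infers read-after-write, write-after-write, and write-after-read dependencies from the insertion order, and then releases tasks in waves once their dependencies are satisfied. The kernels are placeholders that only log their names; no numerical work is done.

```python
from collections import defaultdict


class TaskGraph:
    """Hypothetical toy runtime: infers a task DAG from serial task insertion."""

    def __init__(self):
        self.tasks = []                    # (name, kernel) in insertion order
        self.deps = defaultdict(set)       # task id -> ids it must wait for
        self.last_writer = {}              # tile -> id of the last task that wrote it
        self.readers = defaultdict(list)   # tile -> ids that read it since that write

    def insert(self, name, kernel, reads=(), writes=()):
        """Add a task; tiles in `writes` are treated as read-write."""
        tid = len(self.tasks)
        self.tasks.append((name, kernel))
        self.deps[tid]                     # make sure every task has an entry
        for tile in reads:
            if tile in self.last_writer:   # read-after-write
                self.deps[tid].add(self.last_writer[tile])
        for tile in writes:
            if tile in self.last_writer:   # write-after-write
                self.deps[tid].add(self.last_writer[tile])
            self.deps[tid].update(self.readers[tile])  # write-after-read
        # Record this task's accesses only after its dependencies are known.
        for tile in reads:
            self.readers[tile].append(tid)
        for tile in writes:
            self.last_writer[tile] = tid
            self.readers[tile] = []
        return tid

    def run(self):
        """Execute tasks wave by wave; tasks within one wave are independent."""
        done, waves = set(), []
        while len(done) < len(self.tasks):
            ready = [t for t in range(len(self.tasks))
                     if t not in done and self.deps[t] <= done]
            for t in ready:                # a real runtime would dispatch these to cores
                self.tasks[t][1]()
            done.update(ready)
            waves.append([self.tasks[t][0] for t in ready])
        return waves


def tile_cholesky_dag(nt, log):
    """Insert the tasks of a tile Cholesky factorization in serial loop order."""
    g = TaskGraph()

    def add(name, reads, writes):
        g.insert(name, lambda: log.append(name), reads, writes)

    for k in range(nt):
        add(f"POTRF({k},{k})", [], [(k, k)])
        for m in range(k + 1, nt):
            add(f"TRSM({m},{k})", [(k, k)], [(m, k)])
        for m in range(k + 1, nt):
            add(f"SYRK({m},{m})", [(m, k)], [(m, m)])
            for n in range(k + 1, m):
                add(f"GEMM({m},{n})", [(m, k), (n, k)], [(m, n)])
    return g


log = []
waves = tile_cholesky_dag(3, log).run()
for i, wave in enumerate(waves):
    print(i, wave)
```

For a 3 x 3 tile matrix, the two TRSM tasks of the first panel fall into the same wave, and the SYRK and GEMM updates that follow them fall into the next one, even though the loop nest presents all tasks one after another. A fork-join style would instead place a barrier after each step. Executing in discrete waves is a simplification made here for readability; an asynchronous runtime would start each task as soon as its own dependencies complete.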
