Author: Asim YarKhan
DOI:
Keywords: Supercomputer architecture, Parallel computing, Distributed memory, Data diffusion machine, Interleaved memory, Dataflow, Distributed shared memory, Computer science, Uniform memory access, Shared memory
Abstract: Multicore architectures with high core counts have come to dominate the world of high performance computing, from shared memory machines to the largest distributed memory clusters. The multicore route to increased performance has a simpler design and better power efficiency than the traditional approach of increasing processor frequencies. But standard programming techniques are not well adapted to this change in computer architecture design. In this work, we study the use of dynamic runtime environments executing data-driven applications as a solution to programming multicore architectures. The goals of our runtime environments are productivity, scalability and performance. We demonstrate productivity by defining a simple programming interface to express code. Our runtime environments are experimentally shown to be scalable and to give competitive performance on large multicore and distributed memory machines. This work is driven by linear algebra algorithms, where state-of-the-art libraries (e.g., LAPACK and ScaLAPACK) using a fork-join or block-synchronous execution style do not use the available resources in the most efficient manner. Research work in linear algebra has reformulated these algorithms as tasks acting on tiles of data, with data dependency relationships between the tasks. This results in a task-based DAG for the reformulated algorithms, which can be executed via asynchronous data-driven execution paths analogous to dataflow execution. We study an API and runtime environment for shared memory architectures that efficiently executes serially presented tile-based algorithms. This runtime is used to enable linear algebra applications and is shown to deliver performance competitive with state-of-the-art commercial and research libraries.
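The idea of turning a serially presented tile algorithm into a task DAG can be illustrated with a small sketch. The code below is not the API studied in this work. It is a minimal illustration under stated assumptions: the names `TaskGraph`, `insert`, `READ`, `RW` and `tile_cholesky_tasks` are hypothetical, tasks are symbolic (no numerical kernels are called), and ready tasks are ordered by a serial loop where a real runtime would dispatch them to worker threads. Dependencies are inferred from the declared access mode of each tile, and a tile Cholesky factorization serves as the example algorithm.

```python
from collections import defaultdict, deque

# Hypothetical access modes: READ = input only, RW = tile is modified.
READ, RW = "read", "readwrite"


class TaskGraph:
    """Builds a DAG from tasks inserted in serial program order."""

    def __init__(self):
        self.names = []                    # names[t] = label of task t
        self.preds = []                    # preds[t] = ids that task t waits on
        self.last_writer = {}              # tile -> id of the last task that wrote it
        self.readers = defaultdict(list)   # tile -> readers since that last write

    def insert(self, name, accesses):
        """Add a task; `accesses` is a list of (tile, mode) pairs."""
        tid = len(self.names)
        deps = set()
        for tile, mode in accesses:
            if tile in self.last_writer:
                # read-after-write or write-after-write hazard
                deps.add(self.last_writer[tile])
            if mode == RW:
                # write-after-read hazard
                deps.update(self.readers[tile])
        for tile, mode in accesses:
            if mode == RW:
                self.last_writer[tile] = tid
                self.readers[tile] = []
            else:
                self.readers[tile].append(tid)
        self.names.append(name)
        self.preds.append(deps)
        return tid

    def run(self):
        """Return task ids in a valid data-driven order (Kahn's algorithm)."""
        remaining = [len(p) for p in self.preds]
        succs = defaultdict(list)
        for t, p in enumerate(self.preds):
            for d in p:
                succs[d].append(t)
        ready = deque(t for t, n in enumerate(remaining) if n == 0)
        order = []
        while ready:
            t = ready.popleft()      # a real runtime would hand this to a worker
            order.append(t)
            for s in succs[t]:
                remaining[s] -= 1
                if remaining[s] == 0:
                    ready.append(s)
        return order


def tile_cholesky_tasks(nt):
    """Insert the tasks of a tile Cholesky factorization on an nt x nt tile grid."""
    g = TaskGraph()

    def A(i, j):
        return ("A", i, j)           # symbolic handle for tile (i, j)

    for k in range(nt):
        g.insert(f"POTRF({k})", [(A(k, k), RW)])
        for m in range(k + 1, nt):
            g.insert(f"TRSM({m},{k})", [(A(k, k), READ), (A(m, k), RW)])
        for m in range(k + 1, nt):
            g.insert(f"SYRK({m},{k})", [(A(m, k), READ), (A(m, m), RW)])
            for n in range(k + 1, m):
                g.insert(f"GEMM({m},{n},{k})",
                         [(A(m, k), READ), (A(n, k), READ), (A(m, n), RW)])
    return g
```

Calling `tile_cholesky_tasks(3)` and then `run()` yields one ordering that respects every inferred dependency. In this sketch the two TRSM tasks of the first panel depend only on the first POTRF, so they are independent of each other, which is the kind of parallelism a fork-join formulation would not expose between steps.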