Dense Linear Algebra on Distributed Heterogeneous Hardware with a Symbolic DAG Approach


Keywords: Computer hardware, Chip, Pipeline (computing), Processor design, Bandwidth (signal processing), Petascale computing, Clock rate, Multi-core processor, Computer science, Limit (music)

Abstract: Dense Linear Algebra on Distributed Heterogeneous Hardware with a Symbolic DAG Approach. George Bosilca, Thomas Herault, Aurelien Bouteiller, Piotr Luszczek, Anthony Danalis, Jack J. Dongarra. January 24, 2012.

Introduction and Motivation

Among the various factors that drive the momentous changes occurring in the design of microprocessors and high end systems [1], three stand out as especially notable:

1. the number of transistors per chip will continue the current trend, i.e. double roughly every 18 months, while the speed of processor clocks will cease to increase;
2. the physical limit on the number and bandwidth of the CPUs pins is becoming a near-term reality;
3. a strong drift toward hybrid/heterogeneous systems for petascale (and larger) systems is taking place.

While the first two involve fundamental physical limitations that current technology trends are unlikely to overcome in the near term, the third is an obvious consequence of the first two, combined with the economic necessity of using many thousands of computational units to scale up to petascale and larger systems.

More transistors and slower clocks require multicore designs and increased parallelism. The laws of traditional processor design (increasing the transistor density, speeding up the clock rate, lowering the voltage) have now been stopped by a set of physical barriers: excess heat produced, too much power consumed, too much energy leaked, and useful signal overcome by noise. Multicore designs are a natural evolutionary response to this situation. By putting multiple processor cores on a single die, architects can overcome the previous limitations and continue to increase the number of gates per chip without increasing the power densities. However, since excess heat production means that frequencies cannot be further increased, deep-and-narrow pipeline models will tend to recede as shallow-and-wide pipeline designs become the norm. Moreover, despite obvious similarities, multicore processors are not equivalent to multiple-CPUs or to SMPs. Multiple cores on the same chip share ...


References (36)

Herb Sutter. The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software. (2005)

J. Dongarra, L. S. Blackford, J. Demmel, A. Petitet, I. Dhillon, E. D'Azevedo, R. C. Whaley, G. Henry, K. Stanley, J. Choi, S. Hammarling, A. Cleary, D. Walker. ScaLAPACK Users' Guide. (1997)

John A. Sharp. Data flow computing: theory and practice. Ablex Publishing Corp. (1992)

Edward Grady Coffman, Peter J. Denning. Operating Systems Theory. Prentice Hall Professional Technical Reference (1973)

Azzam Haidar, Hatem Ltaief, Asim YarKhan, Jack Dongarra. Analysis of dynamically scheduled tile algorithms for dense linear algebra on multicore architectures. Concurrency and Computation: Practice and Experience, vol. 24, pp. 305-321 (2011). DOI: 10.1002/CPE.1829

Alfredo Buttari, Jack Dongarra, Jakub Kurzak, Julien Langou, Piotr Luszczek, Stanimire Tomov. The impact of multicore on math software. parallel computing, pp. 1-10 (2006). DOI: 10.1007/978-3-540-75755-9_1

Allen D. Malony, Wolfgang E. Nagel. The open trace format (OTF) and open tracing for HPC. conference on high performance computing (supercomputing), pp. 24- (2006). DOI: 10.1145/1188455.1188480

J.L. Hess, A.M.O. Smith. Calculation of potential flow about arbitrary bodies. Progress in Aerospace Sciences, vol. 8, pp. 1-138 (1967). DOI: 10.1016/0376-0421(67)90003-6

Ernie Chan, Field G. Van Zee, Paolo Bientinesi, Enrique S. Quintana-Orti, Gregorio Quintana-Orti, Robert van de Geijn. SuperMatrix. Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming - PPoPP '08, pp. 123-132 (2008). DOI: 10.1145/1345206.1345227


G.W. Stewart. The decompositional approach to matrix computation. Computing in Science & Engineering, vol. 2, pp. 50-59 (2000). DOI: 10.1109/5992.814658
