Authors: David Moloney, Dermot Geraghty, Colm McSweeney, Ciaran McElroy
DOI: 10.1007/11587514_9
Keywords:
Abstract: A streaming floating-point sparse-matrix compression scheme, which forms a key element of an accelerator for finite-element and other linear algebra applications, is described. The proposed architecture seeks to accelerate the performance-limiting Sparse Matrix-Vector Multiplication (SMVM) operation at the heart of these applications by combining a dedicated datapath optimized for them with a data-compression and decompression unit that increases the effective memory bandwidth seen by the datapath. The format uses variable-length entries that contain an opcode and, optionally, an address and/or a non-zero entry. System simulations performed using a cycle-accurate C++ architectural model and a database of over 400 large symmetric and unsymmetric matrices containing up to 20M elements (and a total of 226M non-zeroes) demonstrate that a 20% average performance improvement can be achieved compared with published work, for a modest increase in hardware resources.
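
To make the idea of variable-length entries concrete, the following is a minimal sketch of an opcode-prefixed stream in which the address and the non-zero value are each optional. The abstract does not give the actual opcode layout, so everything here is an assumption for illustration only: the bit flags `HAS_ADDR` and `HAS_VALUE`, the 32-bit column index, the 64-bit value, the rule that a missing address means "previous column plus one", and the rule that a missing value means "repeat the previous value".

```python
import struct

# Hypothetical opcode bits; these are NOT taken from the paper.
HAS_ADDR = 0x1   # entry carries an explicit 32-bit column index
HAS_VALUE = 0x2  # entry carries an explicit 64-bit floating-point value


def encode(entries):
    """Encode (column, value) pairs into a variable-length byte stream."""
    out = bytearray()
    prev_col, prev_val = -1, None
    for col, val in entries:
        opcode = 0
        payload = b""
        if col != prev_col + 1:      # non-consecutive column: store the address
            opcode |= HAS_ADDR
            payload += struct.pack("<I", col)
        if val != prev_val:          # value differs from the last one: store it
            opcode |= HAS_VALUE
            payload += struct.pack("<d", val)
        out.append(opcode)           # one opcode byte, then the optional fields
        out += payload
        prev_col, prev_val = col, val
    return bytes(out)


def decode(stream):
    """Rebuild the (column, value) pairs from the byte stream."""
    entries = []
    pos = 0
    prev_col, prev_val = -1, None
    while pos < len(stream):
        opcode = stream[pos]
        pos += 1
        if opcode & HAS_ADDR:
            (col,) = struct.unpack_from("<I", stream, pos)
            pos += 4
        else:
            col = prev_col + 1       # implicit address: next column
        if opcode & HAS_VALUE:
            (val,) = struct.unpack_from("<d", stream, pos)
            pos += 8
        else:
            val = prev_val           # implicit value: repeat the previous one
        entries.append((col, val))
        prev_col, prev_val = col, val
    return entries


row = [(0, 1.5), (1, 1.5), (5, 2.0), (6, 2.0)]
packed = encode(row)
restored = decode(packed)
```

In this toy scheme the four entries occupy 24 bytes, against 48 bytes if every entry stored a 4-byte index and an 8-byte value. The saving comes entirely from the assumed implicit-address and repeated-value rules, so it says nothing about the compression ratios reported for the real format.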