Precision–Energy–Throughput Scaling of Generic Matrix Multiplication and Convolution Kernels via Linear Projections

Authors: Mohammad Ashraful Anam, Paul N. Whatmough, Yiannis Andreopoulos

DOI: 10.1109/TCSVT.2014.2321071

Abstract: Generic matrix multiplication (GEMM) and convolution (CONV)/cross-correlation kernels often constitute the bulk of the compute- and memory-intensive processing within image/audio recognition and matching systems. We propose a novel method to scale the energy and throughput of GEMM and CONV kernels for such error-tolerant multimedia applications by adjusting the precision of computation. Our technique employs linear projections of the input matrix or signal data during the top-level GEMM and CONV blocking and reordering. The GEMM and CONV kernel processing then uses the projected inputs, and the results are accumulated to form the final outputs. Throughput and energy scaling takes place by changing the number of projections computed by each kernel, which in turn produces approximate results, i.e., changes the precision of the performed computation. Results derived from a voltage- and frequency-scaled ARM Cortex A15 processor running face recognition and music-matching algorithms demonstrate that the proposed approach allows for a 280%–440% increase in throughput and a 75%–80% decrease in energy consumption against optimized GEMM and CONV kernels, without any impact on the obtained recognition or matching accuracy. Even higher gains can be obtained if one is willing to tolerate some reduction in the accuracy of the recognition and matching applications.
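
The idea of trading precision for work by varying the number of projections can be illustrated with a small sketch. This is not the authors' scheme, which the abstract does not specify in detail. It is a generic illustration under assumed choices: the projection vectors are taken to be an orthonormal DCT-II basis over the inner dimension, and the names `dct_basis`, `projected_gemm`, `exact_gemm`, and `max_abs_error` are hypothetical. The product is accumulated one projection at a time, so using all projections recovers the exact result and using fewer gives an approximation. The sketch shows only the precision knob; in this naive form it does not reduce the operation count.

```python
import math


def dct_basis(n):
    """Return n orthonormal DCT-II vectors of length n (assumed basis choice)."""
    basis = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        basis.append([scale * math.cos(math.pi * (j + 0.5) * k / n)
                      for j in range(n)])
    return basis


def projected_gemm(a, b, num_projections):
    """Approximate a @ b by accumulating rank-1 terms (a p_k)(p_k^T b)."""
    m, n, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(m)]
    for vec in dct_basis(n)[:num_projections]:
        # Project every row of a and every column of b onto this vector.
        a_proj = [sum(a[i][j] * vec[j] for j in range(n)) for i in range(m)]
        b_proj = [sum(vec[j] * b[j][q] for j in range(n)) for q in range(p)]
        # Accumulate this projection's contribution to the output.
        for i in range(m):
            for q in range(p):
                c[i][q] += a_proj[i] * b_proj[q]
    return c


def exact_gemm(a, b):
    """Reference triple-loop matrix product."""
    n = len(b)
    return [[sum(a[i][j] * b[j][q] for j in range(n))
             for q in range(len(b[0]))] for i in range(len(a))]


def max_abs_error(x, y):
    """Largest elementwise absolute difference between two matrices."""
    return max(abs(u - v) for rx, ry in zip(x, y) for u, v in zip(rx, ry))


if __name__ == "__main__":
    a = [[1.0, 2.0, 3.0, 4.0], [0.5, -1.0, 2.0, 0.0]]
    b = [[1.0, 0.0], [2.0, 1.0], [0.0, -1.0], [3.0, 2.0]]
    reference = exact_gemm(a, b)
    for k in range(1, 5):
        print(k, max_abs_error(projected_gemm(a, b, k), reference))
```

With all four projections the error is at floating-point rounding level. With fewer, the error depends on how much of the input lies outside the retained projection vectors, which is why smooth or redundant data tolerates a small number of projections better.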
