Precision–Energy–Throughput Scaling of Generic Matrix Multiplication and Convolution Kernels via Linear Projections

Authors: Mohammad Ashraful Anam, Paul N. Whatmough, Yiannis Andreopoulos

DOI: 10.1109/TCSVT.2014.2321071

Abstract: Generic matrix multiplication (GEMM) and convolution (CONV)/cross-correlation kernels often constitute the bulk of the compute- and memory-intensive processing within image/audio recognition and matching systems. We propose a novel method to scale the energy and throughput of GEMM and CONV kernels for such error-tolerant multimedia applications by adjusting the precision of computation. Our technique applies linear projections to the input matrix or signal data during top-level GEMM and CONV blocking and reordering. The kernel then uses the projected inputs, and the results are accumulated to form the final outputs. Throughput and energy scaling takes place by changing the number of projections computed by each kernel, which in turn produces approximate results, i.e., changes the precision of the performed computation. Results derived from a voltage- and frequency-scaled ARM Cortex A15 processor running face recognition and music-matching algorithms demonstrate that the proposed approach allows for a 280%–440% increase in throughput and a 75%–80% decrease in energy consumption against optimized GEMM and CONV kernels, without any impact on the obtained recognition or matching accuracy. Even higher gains can be obtained if one is willing to tolerate some reduction in the accuracy of the applications.
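
The idea in the abstract (project the inputs, run the kernel on each projection, accumulate the partial results, and trade precision for work by computing fewer projections) can be illustrated with a small sketch. This is not the authors' scheme: the abstract does not specify the projections, so the sketch assumes a simple orthonormal sum/difference projection over adjacent pairs along the inner GEMM dimension. The names `matmul`, `project_pairs`, and `projected_gemm` are hypothetical.

```python
import math


def matmul(a, b):
    """Plain GEMM on lists of lists: a is M x K, b is K x N."""
    inner = len(b)
    cols = len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def project_pairs(a, b, sign):
    """Project adjacent pairs along the inner dimension onto (1, sign)/sqrt(2).

    Assumed projection for illustration only: sign=+1 keeps pairwise sums,
    sign=-1 keeps pairwise differences. Each call halves the inner dimension.
    """
    scale = 1.0 / math.sqrt(2.0)
    half = len(b) // 2
    cols = len(b[0])
    a_proj = [[scale * (row[2 * k] + sign * row[2 * k + 1]) for k in range(half)]
              for row in a]
    b_proj = [[scale * (b[2 * k][j] + sign * b[2 * k + 1][j]) for j in range(cols)]
              for k in range(half)]
    return a_proj, b_proj


def projected_gemm(a, b, num_projections):
    """Accumulate the kernel output over the first num_projections projections.

    num_projections=2 reproduces a @ b up to rounding; num_projections=1
    does half of the inner-product work and returns an approximation.
    """
    if len(b) % 2 != 0:
        raise ValueError("inner dimension must be even for pairwise projection")
    if num_projections not in (1, 2):
        raise ValueError("num_projections must be 1 or 2")
    out = [[0.0] * len(b[0]) for _ in a]
    for sign in (1, -1)[:num_projections]:
        a_proj, b_proj = project_pairs(a, b, sign)
        partial = matmul(a_proj, b_proj)  # the unchanged kernel on projected inputs
        for i, row in enumerate(partial):
            for j, value in enumerate(row):
                out[i][j] += value
    return out
```

With both projections the accumulated result equals the ordinary product, because the sum and difference vectors form an orthonormal basis for each pair. Dropping the difference projection is accurate only when adjacent entries along the inner dimension are similar, which is the kind of input-dependent precision loss the abstract describes as tolerable in error-tolerant applications.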
