Statistics Inspired Hardware Architectures For Image And Video Processing

Author: Bilal Muhammad

Abstract: Conventional digital arithmetic circuits are designed to operate on a specified range of operand magnitudes. The architecture of these circuits is typically developed to improve the area-time characteristics and to obtain an energy-efficient implementation operating at a desired throughput. Moreover, these circuits are required to provide full precision for the dynamic range of the operands at the speed of the worst case. Albeit the correctness of the results is ensured, the input data statistics are not taken into account. Resultantly, redundant computations are performed, since the information content of the input sequence in certain applications, such as image and video processing, is known to be much less than the simple binary descriptions used by conventional circuits. However, run-time identification and exploitation of the latent redundancy is non-trivial owing to the diverse statistical nature of the data. Although efforts have been made in the past to design low-power architectures by exploiting the patterns in the magnitudes of the operands, few attempts have been made to improve the logic processing efficiency by harnessing the statistical properties of the inputs. It seems conducive to design application-specific hardware that explicitly incorporates the input data statistics while computing and consequently saves precious computation cycles or resources which are otherwise wasted in redundant computations. This thesis proposes approaches that utilize the redundancy inherent in image and video processing applications to achieve the most economical tradeoffs between resources, time and precision. Specifically, modifications and enhancements to conventional approaches, namely Distributed Arithmetic, Sub-Expression Sharing, Fast FIR parallel filtering and Approximate Processing, are reported to further enhance their efficiency. The conventional Distributed Arithmetic approach to trade off area with time has been modified to include Memoization-based Look Up Tables for the storage of partial results from computations. This modification harnesses the bit-level redundancies in the data and leads to a decrease in the average computation time while requiring a proportionately lesser increase in resource requirements. The proposed architecture is used to implement a Color Space Conversion module for incorporation as an Instruction Set Architecture enhancement in the open-source Intellectual Property Core of the OR1200 32-bit processor.
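As a rough illustration of the conventional Distributed Arithmetic technique named in the abstract (not the thesis's memoization-based modification, whose details are not given here), the sketch below computes a fixed-coefficient inner product bit-serially from a precomputed look-up table. The function names, the unsigned 8-bit input width, and the example luma weights (0.299, 0.587, 0.114 scaled by 256) are assumptions made for this example only.

```python
def build_da_lut(coeffs):
    """Precompute, for every K-bit address, the sum of the selected coefficients."""
    k = len(coeffs)
    return [
        sum(c for j, c in enumerate(coeffs) if (addr >> j) & 1)
        for addr in range(1 << k)
    ]


def da_inner_product(coeffs, xs, nbits=8):
    """Inner product of coeffs and unsigned nbits-wide inputs xs, one bit plane at a time."""
    lut = build_da_lut(coeffs)
    acc = 0
    for b in range(nbits):
        # The LUT address is formed from bit b of every input sample.
        addr = 0
        for j, x in enumerate(xs):
            addr |= ((x >> b) & 1) << j
        # Weight the looked-up partial sum by the significance of bit plane b.
        acc += lut[addr] * (1 << b)
    return acc


# Hypothetical example: integer luma weights applied to one RGB pixel.
LUMA_WEIGHTS = [77, 150, 29]
pixel = [200, 100, 50]
luma_scaled = da_inner_product(LUMA_WEIGHTS, pixel)
```

The multiplications are replaced by table look-ups and shifted additions, which is the area-for-time trade that the abstract refers to. Identical bit patterns across samples hit the same table entry, which hints at the kind of bit-level redundancy the thesis sets out to exploit.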
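Similarly, Sub-Expression Sharing techniques for low-complexity Fast FIR filtering structures are modified to identify the low-entropy portions of the data, which are then processed at reduced precision. The ensuing architectures depict a negligible loss of output precision, with the process saving resources in a higher proportion. Merits of incorporating input statistics in Approximate Processing are illustrated through statistics-inspired architectures for the Sum of Squared Error and Normalized Cross Correlation error metrics used for template matching in Motion Estimation and Disparity Estimation. The designs that apply high precision where appropriate perform better than approximate designs that use brute-force quantization, as well as reduce complexity. The architectures in this thesis either achieve lower computation time using comparable resources, or lower cost with minimal impact on precision, compared to conventional approaches. Efficacy of employing signal statistics is shown by proofs of savings in Field Programmable Gate Array implementations and by performance in several real-world applications. Savings of 70% for the Squarer and a 50% reduction in computation times for Modified Distributed Arithmetic are achieved with the proposed approach.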
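For readers unfamiliar with the two error metrics mentioned above, the following is a minimal full-precision software sketch of the Sum of Squared Error and of Normalized Cross Correlation in its non-mean-subtracted form, together with a one-dimensional template search. It is not the approximate hardware design of the thesis. The function names and the 1-D signal are assumptions made for illustration.

```python
import math


def sse(a, b):
    """Sum of Squared Error between two equally sized blocks (lower is better)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ncc(a, b):
    """Normalized Cross Correlation without mean subtraction (higher is better)."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den else 0.0


def best_match_sse(signal, template):
    """Return the offset in signal whose window has the smallest SSE to template."""
    n = len(template)
    offsets = range(len(signal) - n + 1)
    return min(offsets, key=lambda i: sse(signal[i:i + n], template))


# Hypothetical 1-D example: the template occurs exactly at offset 2.
signal = [0, 1, 5, 9, 4, 1]
template = [5, 9, 4]
offset = best_match_sse(signal, template)
```

Both metrics are dominated by squaring and multiplication, which is why a cheaper squarer or multiplier has a direct effect on the cost of block matching in motion and disparity estimation.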
