A fused hybrid floating-point and fixed-point dot-product for FPGAs

Authors: Antonio Roldao Lopes, George A. Constantinides

DOI: 10.1007/978-3-642-12133-3_16

Keywords: Dot product, Floating point, Computer hardware, Latency (audio), Reduction (complexity), Parallel computing, Fixed point, Acceleration, Field-programmable gate array, Computer science, Clock rate

Abstract: Dot-products are one of the essential and recurrent building blocks in scientific computing, and often take up a large proportion of the acceleration circuitry. The acceleration of dot-products is very well suited for Field Programmable Gate Arrays (FPGAs), since these devices can be configured to employ wide parallelism and deep pipelining and to exploit highly efficient datapaths. In this paper we present a dot-product implementation which operates using a hybrid floating-point and fixed-point number system. This design receives floating-point inputs and generates a floating-point output. Internally it makes use of a configurable word-length fixed-point internal representation, which can be tuned to match the desired accuracy. Results on a high-end Xilinx FPGA for an order-150 dot-product demonstrate that, for equivalent accuracy metrics, it is possible to utilize 3.8 times fewer resources, operate at a 1.62 times faster clock frequency, and achieve a significant reduction in latency when compared to a direct floating-point core based dot-product. Combining these results and utilizing the spare resources to instantiate more units in parallel, it is possible to achieve an overall speed-up of at least 5 times.
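The hybrid scheme described in the abstract can be illustrated in software. The following is a minimal sketch, not the authors' hardware design: it assumes floating-point inputs are quantized to a signed fixed-point format with a configurable number of fractional bits, multiplied and accumulated exactly as integers, and converted back to floating point only once at the end. The names `to_fixed`, `hybrid_dot`, and `frac_bits` are hypothetical and chosen for illustration. Python's unbounded integers stand in for a sufficiently wide accumulator; a real datapath would also have to bound the input range and the accumulator width.

```python
import math


def to_fixed(x: float, frac_bits: int) -> int:
    """Quantize a float to a signed fixed-point integer with `frac_bits`
    fractional bits (round to nearest). Hypothetical helper."""
    return round(x * (1 << frac_bits))


def hybrid_dot(a, b, frac_bits: int = 24) -> float:
    """Dot product with float inputs, fixed-point internals, float output.

    Assumption: the accumulator is wide enough never to overflow, which
    Python's arbitrary-precision integers guarantee here.
    """
    if len(a) != len(b):
        raise ValueError("input vectors must have the same length")
    acc = 0
    for x, y in zip(a, b):
        # Each product carries 2 * frac_bits fractional bits.
        acc += to_fixed(x, frac_bits) * to_fixed(y, frac_bits)
    # Single conversion back to floating point at the output.
    return acc / (1 << (2 * frac_bits))


if __name__ == "__main__":
    # Order-150 example, matching the vector length quoted in the abstract.
    a = [0.1 * i for i in range(150)]
    b = [0.01 * (150 - i) for i in range(150)]
    reference = math.fsum(x * y for x, y in zip(a, b))
    for bits in (8, 16, 24):
        err = abs(hybrid_dot(a, b, bits) - reference)
        print(f"frac_bits={bits:2d}  abs error={err:.3e}")
```

Varying `frac_bits` shows the trade-off the abstract refers to: a shorter internal word length costs accuracy, and a longer one costs resources in hardware. As a rough consistency check on the reported figures, and under the idealized assumption that freed resources translate directly into additional parallel units, 3.8 × 1.62 ≈ 6.2, which agrees with the stated overall speed-up of at least 5 times.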
