Authors: Kyung-Wook Shin, Heung-Woo Jeon, Yong-Seum Kang
DOI: 10.1109/ISCAS.1994.409193
Keywords:
Abstract: This paper describes an efficient array algorithm for parallel computation of the vector-radix two-dimensional (2-D) discrete cosine transform (VR-DCT), and its VLSI implementation. By mapping the 2-D VR-DCT onto an array of processing elements (PEs), the DCT is efficiently computed with high concurrency and local data exchanges between PEs. The proposed algorithm features architectural modularity, regularity, and locality, so that it is very suitable for VLSI realization. Also, no transposition memory is required. It has a time complexity of O(N + N_NZD · log_2 N) for an (N × N) DCT, where N_NZD is the number of non-zero digits in the canonic-signed digit (CSD) representation of the DCT kernel. Based on the algorithm, a processor for the (8 × 8) DCT is designed using 1.5 μm double-metal CMOS technology. From simulation results, it is estimated that the (8 × 8) DCT (with N_NZD = 4) can be computed in about 0.88 μs at a 50 MHz clock frequency, resulting in a throughput rate of about 72 Mpixels/s.
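The N_NZD term and the throughput figure in the abstract can be illustrated with a small sketch. It is not taken from the paper: the function names, the 8-bit fractional scaling of the coefficient, and the choice of cos(π/4) as a sample kernel value are all assumptions made for illustration. The CSD conversion uses the standard non-adjacent-form recoding for integers.

```python
import math

def to_csd(n):
    """Return the CSD (non-adjacent form) digits of integer n, LSB first.

    Each digit is -1, 0, or +1, and no two adjacent digits are both non-zero.
    """
    digits = []
    while n != 0:
        if n & 1:
            d = 2 - (n % 4)   # +1 if n = 1 (mod 4), -1 if n = 3 (mod 4)
            n -= d            # n is now divisible by 4, so the next digit is 0
        else:
            d = 0
        digits.append(d)
        n //= 2
    return digits

def csd_value(digits):
    """Rebuild the integer from LSB-first signed digits."""
    return sum(d << i for i, d in enumerate(digits))

def nonzero_digits(n):
    """Count the non-zero CSD digits of n (the quantity the abstract calls N_NZD)."""
    return sum(1 for d in to_csd(n) if d != 0)

def throughput_pixels_per_sec(n, block_time_s):
    """Pixels per second when one (n x n) block is finished every block_time_s."""
    return (n * n) / block_time_s

if __name__ == "__main__":
    # Hypothetical quantization: cos(pi/4) scaled to 8 fractional bits.
    coeff = round(math.cos(math.pi / 4) * 256)
    print(coeff, to_csd(coeff), nonzero_digits(coeff))
    # Figures quoted in the abstract: (8 x 8) block in about 0.88 us.
    print(throughput_pixels_per_sec(8, 0.88e-6) / 1e6, "Mpixels/s")
```

With the abstract's figures, 64 pixels per 0.88 μs works out to roughly 72.7 Mpixels/s, consistent with the quoted rate of about 72 Mpixels/s. Fewer non-zero CSD digits mean fewer shift-and-add steps per constant multiplication, which is why N_NZD appears in the time complexity.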