Authors: Jaegeun Oh, Seok Joong Hwang, Huong Giang Nguyen, Areum Kim, Seon Wook Kim
DOI: 10.4218/ETRIJ.08.0107.0343
Keywords: Computer science, Parallel computing, Pipeline (computing), Degree of parallelism, Task parallelism, Lockstep, Compiler, SIMD, Multiprocessing, Multithreading
Abstract: In most parallel loops of embedded applications, every iteration executes the exact same sequence of instructions while manipulating different data. This fact motivates a new compiler-hardware orchestrated execution framework in which all parallel threads share one fetch unit and one decode unit but have their own execution, memory, and write-back units. This resource sharing enables parallel threads to execute in lockstep with minimal hardware extension and compiler support. Our proposed architecture, called the multithreaded lockstep execution processor (MLEP), is a compromise between the single-instruction multiple-data (SIMD) and symmetric multithreading/chip multiprocessor (SMT/CMP) solutions. The proposed approach is more favorable than typical SIMD execution in terms of degree of parallelism, range of applicability, and code generation, and it can save more power and chip area than the SMT/CMP approach without significant performance degradation. For architecture verification, we extend a commercial 32-bit core, AE32000C, and synthesize it on a Xilinx FPGA. Compared to the original architecture, our approach is 13.5% faster with a 2-way MLEP and 33.7% faster with a 4-way MLEP on EEMBC benchmarks that are automatically parallelized by the Intel compiler. Keywords: ILP, TLP, SMT, CMP, MLEP.