Authors: Roman Lysecky, Frank Vahid
Keywords:
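Abstract: In previous work, we showed the benefits and feasibility of having a processor dynamically partition its executing software such that critical software kernels are transparently partitioned to execute as a hardware coprocessor on configurable logic, an approach we call warp processing. The configurable logic place and route step is the most computationally intensive part of such hardware/software partitioning, normally running for many minutes or hours on powerful desktop processors. In contrast, dynamic partitioning requires place and route to execute in just seconds on a lean embedded processor. We have therefore designed a configurable logic architecture specifically for dynamic hardware/software partitioning. Through experiments with popular benchmarks, we show that by focusing on the goal of software kernel speedup when designing the FPGA architecture, rather than on the more general goal of ASIC prototyping, we can perform place and route for our architecture 50 times faster, using 10,000 times less data memory and 1,000 times less code memory, than popular commercial tools mapping to commercial configurable logic. Yet, we show that we obtain speedups (2x on average, and as much as 4x) and energy savings (33% on average, and up to 74%) when partitioning even just one loop, which are comparable to those obtained with commercial tools and fabrics. Thus, our configurable logic architecture represents a good candidate for platforms that will support dynamic hardware/software partitioning, and enables ultra-fast desktop tools for hardware/software partitioning, and even for fast configurable logic design in general.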