EXOCHI

Authors: Perry H. Wang, Jamison D. Collins, Gautham N. Chinya, Hong Jiang, Xinmin Tian

DOI: 10.1145/1250734.1250753

Abstract: Future mainstream microprocessors will likely integrate specialized accelerators, such as GPUs, onto a single die to achieve better performance and power efficiency. However, it remains a keen challenge to program such a heterogeneous multicore platform, since these accelerators feature ISAs and functionality that are significantly different from the general-purpose CPU cores. In this paper, we present EXOCHI: (1) Exoskeleton Sequencer (EXO), an architecture to represent heterogeneous accelerators as ISA-based MIMD resources, and a shared virtual memory heterogeneous multithreaded execution model that tightly couples accelerator cores with general-purpose CPU cores, and (2) C for Heterogeneous Integration (CHI), an integrated C/C++ programming environment that supports accelerator-specific inline assembly and domain-specific languages. The CHI compiler extends the OpenMP pragma for heterogeneous multithreading programming, and produces a single fat binary with code sections corresponding to different instruction sets. The runtime can judiciously spread parallel computation across the heterogeneous cores to optimize performance and power. We have prototyped the EXO architecture on a physical heterogeneous platform consisting of an Intel® Core™ 2 Duo processor and an 8-core 32-thread Intel® Graphics Media Accelerator X3000. In addition, we have implemented the CHI integrated programming environment with the Intel® C++ Compiler, runtime toolset, and debugger. On the EXO prototype system, we have enhanced a suite of production-quality media kernels for video and image processing to utilize the accelerator through the CHI programming interface, achieving significant speedup (1.41X to 10.97X) over execution on the IA32 CPU alone.
