Automatic extraction of pipeline parallelism for embedded heterogeneous multi-core platforms

Authors: Daniel Cordes, Michael Engel, Olaf Neugebauer, Peter Marwedel

DOI: 10.5555/2555729.2555733

Abstract: Automatic parallelization of sequential applications is the key for efficient use and optimization of current and future embedded multi-core systems. However, existing approaches often fail to achieve efficient balancing of tasks running on heterogeneous cores of an MPSoC. A reason for this is often insufficient knowledge of the underlying architecture's performance. In this paper, we present a novel parallelization approach for embedded MPSoCs that combines pipeline parallelization for loops with knowledge about different execution times for tasks on cores with different performance properties. Using Integer Linear Programming, an optimal solution with respect to the model used is derived, implementing tasks with a well-balanced execution behavior. We evaluate our pipeline parallelization approach for heterogeneous MPSoCs using a set of standard embedded benchmarks and compare it with two existing state-of-the-art approaches. For all benchmarks, our parallelization approach obtains significantly higher speedups than either approach on heterogeneous MPSoCs.
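The balancing problem described in the abstract can be illustrated with a toy model. The sketch below is not the authors' Integer Linear Programming formulation: it swaps in an exhaustive search, ignores communication costs, and uses made-up statement costs and core names (`big`, `little`). It only shows why per-core execution times matter when a loop body is split into pipeline stages.

```python
from itertools import combinations, permutations

def best_pipeline(costs, cores):
    """Exhaustively search contiguous pipeline stages and core assignments.

    costs[i][t] is the (hypothetical) time of loop statement i on core type t.
    cores lists the type of each available core; each core runs at most one stage.
    Returns (bottleneck_time, [(core_type, (start, end)), ...]).
    """
    n = len(costs)
    best_time, best_stages = float("inf"), None
    for k in range(1, min(n, len(cores)) + 1):
        # k - 1 cut points split the n statements into k contiguous stages.
        for cuts in combinations(range(1, n), k - 1):
            bounds = list(zip((0,) + cuts, cuts + (n,)))
            # Try every assignment of k distinct cores to the k stages.
            for chosen in permutations(range(len(cores)), k):
                times = [sum(costs[i][cores[c]] for i in range(lo, hi))
                         for c, (lo, hi) in zip(chosen, bounds)]
                # Pipeline throughput is limited by the slowest stage.
                if max(times) < best_time:
                    best_time = max(times)
                    best_stages = [(cores[c], b) for c, b in zip(chosen, bounds)]
    return best_time, best_stages

# Assumed example data: four statements, a fast and a slow core type.
costs = [
    {"big": 2, "little": 4},
    {"big": 3, "little": 6},
    {"big": 1, "little": 2},
    {"big": 2, "little": 4},
]
cores = ["big", "little"]
bottleneck, stages = best_pipeline(costs, cores)
print(bottleneck, stages)
```

With these assumed numbers, running everything on the `big` core alone takes 8 time units per iteration, while the best two-stage split has a bottleneck of 6. A splitter that assumed identical cores would balance statement counts rather than stage times, which is the mismatch the abstract attributes to existing approaches.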

References (20)
Christian Lengauer. Loop Parallelization in the Polytope Model. International Conference on Concurrency Theory, pp. 398-416 (1993). DOI: 10.1007/3-540-57208-2_28
William Thies, Michal Karczmarek, Saman Amarasinghe. StreamIt: A Language for Streaming Applications. Compiler Construction, pp. 179-196 (2002). DOI: 10.1007/3-540-45937-5_14
Daniel Cordes, Andreas Heinig, Peter Marwedel, Arindam Mallik. Automatic Extraction of Pipeline Parallelism for Embedded Software Using Linear Programming. 2011 IEEE 17th International Conference on Parallel and Distributed Systems, pp. 699-706 (2011). DOI: 10.1109/ICPADS.2011.31
Easwaran Raman, Guilherme Ottoni, Arun Raman, Matthew J. Bridges, David I. August. Parallel-stage decoupled software pipelining. Symposium on Code Generation and Optimization, pp. 114-123 (2008). DOI: 10.1145/1356058.1356074
Georgios Tournavitis, Björn Franke. Semi-automatic extraction and exploitation of hierarchical pipeline parallelism using profiling information. International Conference on Parallel Architectures and Compilation Techniques, pp. 377-388 (2010). DOI: 10.1145/1854273.1854321
V. Sarkar. Automatic partitioning of a program dependence graph into parallel tasks. IBM Journal of Research and Development, vol. 35, pp. 779-804 (1991). DOI: 10.1147/RD.355.0779
Rohit Chandra, Ding-Kai Chen, Robert Cox, Dror E. Maydan, Nenad Nedeljkovic, Jennifer M. Anderson. Data distribution support on distributed shared memory multiprocessors. Programming Language Design and Implementation, vol. 32, pp. 334-345 (1997). DOI: 10.1145/258915.258945
Uday Bondhugula, Albert Hartono, J. Ramanujam, P. Sadayappan. A practical automatic polyhedral parallelizer and locality optimizer. Proceedings of the 2008 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI '08), vol. 43, pp. 101-113 (2008). DOI: 10.1145/1375581.1375595
Daniel Cordes, Olaf Neugebauer, Michael Engel, Peter Marwedel. Automatic Extraction of Task-Level Parallelism for Heterogeneous MPSoCs. International Conference on Parallel Processing, pp. 950-959 (2013). DOI: 10.1109/ICPP.2013.113
Duo Liu, Zili Shao, Meng Wang, Minyi Guo, Jingling Xue. Optimal loop parallelization for maximizing iteration-level parallelism. Compilers, Architecture, and Synthesis for Embedded Systems, pp. 67-76 (2009). DOI: 10.1145/1629395.1629407