Function level parallelism driven by data dependencies

作者: Sean Rul , Hans Vandierendonck , Koen De Bosschere

DOI: 10.1145/1241601.1241612

关键词: Automatic parallelizationTask parallelismParallel computingData parallelismParallelism (grammar)Theoretical computer scienceInstruction-level parallelismComputer scienceData structureSpeedupImplicit parallelismData-flow analysis

摘要: With the rise of Chip multiprocessors (CMPs), amount parallel computing power will increase significantly in near future. However, most programs are sequential nature and have not been explicitly parallelized, so they cannot exploit these resources. Automatic parallelization sequential, non-regular codes is very hard, as illustrated by lack solutions after more than 30 years research on topic. The question remains if there parallelism that can be detected automatically so, how much is.In this paper, we propose a framework for extracting potential from programs. Applying to teach us present program, but also tells what appropriate construct program is, e.g. pipeline, master/slave work distribution, etc.Our profile-based, implying it safe. It builds two new graph representations profile-data: interprocedural data flow sharing graph. This graphs show data-flow between functions structures facilitating data-flow, respectively.We apply our SPECcpu2000 bzip2 benchmark, achieving speedup 3.74 compression part global 2.45 quad processor system.

参考文章(16)
Devang Shah, Steve Kleiman, Bart Smaalders, Programming with threads ,(1996)
Alexander V. Veidenbaum, Sergey Kozhukhov, Alexandru Nicolau, Xinmin Tian, Hideki Saito, Constantine D. Polychronopoulos, Arun Kejariwal, Utpal Banerjee, Wei Li, Milind Girkar, On the Performance Potential of Different Types of Speculative Thread-Level Parallelism ,(2006)
Nicholas Nethercote, Alan Mycroft, Redux: A Dynamic Dataflow Tracer Electronic Notes in Theoretical Computer Science. ,vol. 89, pp. 149- 170 ,(2003) , 10.1016/S1571-0661(04)81047-8
J.G. Steffan, T.C. Mowry, The potential for using thread-level data speculation to facilitate automatic parallelization high-performance computer architecture. pp. 2- 13 ,(1998) , 10.1109/HPCA.1998.650541
Zhao, Multithreaded dependence graphs for concurrent Java programs 1999 Proceedings International Symposium on Software Engineering for Parallel and Distributed Systems. pp. 13- 23 ,(1999) , 10.1109/PDSE.1999.779735
Dean M. Tullsen, John S. Seng, Storageless value prediction using prior register values international symposium on computer architecture. ,vol. 27, pp. 270- 279 ,(1999) , 10.1145/300979.301002
B. Flachs, S. Asano, S.H. Dhong, H.P. Hofstee, G. Gervais, R. Kim, T. Le, P. Liu, J. Leenstra, J. Liberty, B. Michael, H.-J. Oh, S.M. Mueller, O. Takahashi, A. Hatakeyama, Y. Watanabe, N. Yano, D.A. Brokenshire, M. Peyravian, V. To, E. Iwata, The Microarchitecture of the Synergistic Processor for a Cell Processor IEEE Journal of Solid-State Circuits. ,vol. 41, pp. 63- 70 ,(2006) , 10.1109/JSSC.2005.859332
Manohar K. Prabhu, Kunle Olukotun, Exposing speculative thread parallelism in SPEC2000 Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming - PPoPP '05. pp. 142- 152 ,(2005) , 10.1145/1065944.1065964
Amy W. Lim, Monica S. Lam, Maximizing parallelism and minimizing synchronization with affine transforms symposium on principles of programming languages. pp. 201- 214 ,(1997) , 10.1145/263699.263719
Arun Kejariwal, Constantine D. Polychronopoulos, Xinmin Tian, Wei Li, Milind Girkar, Sergey Kozhukhov, Hideki Saito, Utpal Banerjee, Alexandru Nicolau, Alexander V. Veidenbaum, On the performance potential of different types of speculative thread-level parallelism Proceedings of the 20th annual international conference on Supercomputing - ICS '06. pp. 24- ,(2006) , 10.1145/1183401.1183407