Dynamic cluster assignment mechanisms

Authors: R. Canal, J.M. Parcerisa, A. Gonzalez

DOI: 10.1109/HPCA.2000.824345

Abstract: Clustered microarchitectures are an effective approach to reducing the penalties caused by wire delays inside a chip. Current superscalar processors in fact have a two-cluster microarchitecture with a naive code partitioning approach: integer instructions are allocated to one cluster and floating-point instructions to the other. This scheme is simple and results in no communication between the two clusters (other than through memory), but it is in general far from optimal because the workload is not evenly distributed most of the time. In fact, when the processor is running integer programs, the workload is extremely unbalanced, since the FP cluster is not used at all. In this work we investigate run-time mechanisms that dynamically distribute a program's instructions among these clusters. By optimizing the trade-off between inter-cluster communication penalty and workload balance, the proposed schemes can achieve an average speed-up of 36% for the SpecInt95 benchmark suite.
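The trade-off described in the abstract can be illustrated with a toy steering rule. The sketch below is not one of the paper's schemes; it is a minimal illustration under stated assumptions: two clusters, a per-cluster count of pending instructions as the balance metric, and a hypothetical `imbalance_threshold` parameter. The function name `steer` and all argument names are invented for this example.

```python
from collections import Counter


def steer(src_clusters, load, imbalance_threshold=4):
    """Pick a cluster (0 or 1) for one instruction.

    src_clusters: clusters that produce the instruction's source operands
    load: [pending_in_cluster_0, pending_in_cluster_1]
    imbalance_threshold: hypothetical tuning knob, not taken from the paper
    """
    least_loaded = 0 if load[0] <= load[1] else 1

    # Balance first: when the clusters are badly unbalanced, accept a
    # possible inter-cluster copy to relieve the overloaded cluster.
    if abs(load[0] - load[1]) > imbalance_threshold:
        return least_loaded

    # Otherwise minimize communication: go where most operands are produced.
    counts = Counter(src_clusters)
    if counts[0] != counts[1]:
        return 0 if counts[0] > counts[1] else 1

    # Tie, or no register operands: fall back to the least loaded cluster.
    return least_loaded


if __name__ == "__main__":
    print(steer([0, 0], [3, 2]))   # operands in cluster 0, loads balanced
    print(steer([0, 0], [10, 2]))  # cluster 0 overloaded
```

With balanced loads the instruction follows its operands, which avoids inter-cluster communication; once the imbalance exceeds the assumed threshold, the rule gives up operand locality to even out the workload.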
