Chip multi-processor scalability for single-threaded applications

Authors: Neil Vachharajani, Matthew Iyer, Chinmay Ashok, Manish Vachharajani, David I. August

DOI: 10.1145/1105734.1105741

Keywords: Uniprocessor system, Parallelism (grammar), Scalability, Data parallelism, Task parallelism, Instruction-level parallelism, Computer science, Embedded system, Computer architecture

Abstract: The exponential increase in uniprocessor performance has begun to slow. Designers have been unable to scale performance while managing thermal, power, and electrical effects. Furthermore, design complexity limits the size of monolithic processors that can be designed while keeping costs reasonable. Industry has responded by moving toward chip multi-processor architectures (CMP). These architectures are composed from replicated processors utilizing the die area afforded by newer design processes. While this approach mitigates the issues with design complexity, power, and electrical effects, it does nothing to directly improve the performance of contemporary or future single-threaded applications. This paper examines the scalability potential for exploiting the parallelism in single-threaded applications on these CMP platforms. The paper explores the total available parallelism in unmodified sequential applications and then examines the viability of exploiting this parallelism on CMP machines. Using the results from this analysis, the paper forecasts that CMPs, using the "intrinsic" parallelism in a program, can sustain the performance improvement users have come to expect from new processors for only 6-8 years, provided many successful parallelization efforts emerge. Given this outlook, the paper advocates exploring methodologies which achieve parallelism beyond this "intrinsic" limit of programs.

References (35)
A. Gandhi, H. Akkary, S.T. Srinivasan. Reducing branch misprediction penalty via selective branch recovery. High-Performance Computer Architecture, pp. 254-264 (2004). DOI: 10.1109/HPCA.2004.10004
Troy A. Johnson, Rudolf Eigenmann, T. N. Vijaykumar. Min-cut program decomposition for thread-level speculation. Proceedings of the ACM SIGPLAN 2004 conference on Programming language design and implementation - PLDI '04, vol. 39, pp. 59-70 (2004). DOI: 10.1145/996841.996851
Matthew A. Postiff, David A. Greene, Gary S. Tyson, Trevor N. Mudge. The limits of instruction level parallelism in SPEC95 applications. ACM SIGARCH Computer Architecture News, vol. 27, pp. 31-34 (1999). DOI: 10.1145/309758.309771
Neil Vachharajani, Ram Rangan, David I. August, Manish Vachharajani. Decoupled Software Pipelining with the Synchronization Array. International Conference on Parallel Architectures and Compilation Techniques, pp. 177-188 (2004). DOI: 10.5555/1025127.1026007
Pedro Marcuello, Antonio González. Clustered speculative multithreaded processors. International Conference on Supercomputing, pp. 365-372 (1999). DOI: 10.1145/305138.305214
Manohar K. Prabhu, Kunle Olukotun. Exposing speculative thread parallelism in SPEC2000. Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming - PPoPP '05, pp. 142-152 (2005). DOI: 10.1145/1065944.1065964
Yuan Chou, Jason Fung, John Paul Shen. Reducing branch misprediction penalties via dynamic control independence detection. International Conference on Supercomputing, pp. 109-118 (1999). DOI: 10.1145/305138.305175
L. Hammond, B.A. Hubbert, M. Siu, M.K. Prabhu, M. Chen, K. Olukotun. The Stanford Hydra CMP. IEEE Micro, vol. 20, pp. 71-84 (2000). DOI: 10.1109/40.848474
Marco Galluzzi, Valentín Puente, Adrián Cristal, Ramón Beivide, José-Ángel Gregorio, Mateo Valero. A first glance at Kilo-instruction based multiprocessors. Proceedings of the first conference on computing frontiers on Computing frontiers - CF'04, pp. 212-221 (2004). DOI: 10.1145/977091.977120
Hongzhang Shan, Jaswinder Pal Singh. A comparison of MPI, SHMEM and cache-coherent shared address space programming models on the SGI Origin2000. International Conference on Supercomputing, pp. 329-338 (1999). DOI: 10.1145/305138.305210