Efficient Hardware Barrier Synchronization in Many-Core CMPs

Authors: Jose L. Abellan, Juan Fernandez, Manuel E. Acacio

DOI: 10.1109/TPDS.2011.304

Abstract: Traditional software-based barrier implementations for shared memory parallel machines tend to produce hotspots in terms of memory and network contention as the number of processors increases. This could limit their applicability to future many-core CMPs, in which possibly several dozens of cores would need to be synchronized efficiently. In this work, we develop GBarrier, a hardware-based barrier mechanism especially aimed at providing efficient barriers in future many-core CMPs. Our proposal deploys a dedicated G-line-based network to allow for fast and efficient signaling of barrier arrival and departure. Since GBarrier does not have any influence on the memory system, we avoid all coherence activity and barrier-related network traffic that traditional approaches introduce and that restrict scalability. Through detailed simulations of a 32-core CMP, we compare GBarrier against one of the most efficient software-based barrier implementations for a set of kernels and scientific applications. Evaluation results show average reductions of 54 and 21 percent in execution time, 53 and 18 percent in network traffic, and also 76 and 31 percent in the energy-delay² product metric for the full CMP when the kernels and scientific applications, respectively, are considered.
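To make the hotspot argument concrete, the following is a minimal sketch of a generic centralized sense-reversing software barrier. It is a textbook illustration only, not the software baseline evaluated in the paper and not GBarrier itself. The class name `SenseReversingBarrier` and the helper `run_demo` are hypothetical, and a `threading.Lock` stands in for the atomic fetch-and-decrement a real implementation would use. Every thread updates the same counter and polls the same flag, which is the kind of shared-memory traffic the abstract says a dedicated hardware network avoids.

```python
import threading
import time


class SenseReversingBarrier:
    """Centralized software barrier: one shared counter plus one shared flag."""

    def __init__(self, num_threads):
        self.num_threads = num_threads
        self.count = num_threads        # threads still to arrive in this episode
        self.sense = False              # global flag that every waiter polls
        self.lock = threading.Lock()    # assumption: stands in for an atomic op
        self.local = threading.local()  # holds each thread's private sense

    def wait(self):
        # Each thread flips its private sense when it enters a new episode.
        my_sense = not getattr(self.local, "sense", False)
        self.local.sense = my_sense
        with self.lock:
            self.count -= 1             # all threads contend on this counter
            last = self.count == 0
            if last:
                self.count = self.num_threads  # reset for the next episode
                self.sense = my_sense          # release every waiter
        if not last:
            while self.sense != my_sense:      # all waiters poll the same flag
                time.sleep(0)                  # yield instead of burning CPU


def run_demo(num_threads=4, phases=3):
    """Run several barrier episodes and check that no thread leaves early."""
    barrier = SenseReversingBarrier(num_threads)
    arrived = [0] * phases
    violations = []
    counter_lock = threading.Lock()

    def worker():
        for phase in range(phases):
            with counter_lock:
                arrived[phase] += 1
            barrier.wait()
            # Past the barrier, every thread must already have arrived.
            if arrived[phase] != num_threads:
                violations.append(phase)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return arrived, violations


if __name__ == "__main__":
    print(run_demo())
```

On a cache-coherent machine, each decrement of `count` and each change of `sense` would invalidate the copies held by the other cores, so coherence messages grow with the number of participants. The sketch only reproduces the logical structure of such a barrier; it does not model that traffic.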
