Design of scalable PGAS collectives for NUMA and manycore systems

Author: Damián Álvarez Mallón

Abstract: The number of cores per processor is growing, making multicore systems ubiquitous. This means dealing with multiple levels of NUMA memory, accessible through complex hierarchies, in order to process growing amounts of data. The key to efficient and scalable data movement is the use of collective communication operations that minimize the impact of bottlenecks. One-sided communication becomes more important on these systems because it avoids the synchronization between pairs of processes that arises in collectives implemented with two-sided point-to-point functions. This thesis proposes a series of collective algorithms that provide good performance and scalability. These algorithms use hierarchical trees, overlapping of one-sided communications, message pipelining, and NUMA affinity. An implementation has been developed for UPC, a PGAS language whose performance has also been evaluated in this thesis. To assess the performance of these algorithms, a new microbenchmarking tool was designed and implemented. The evaluation of the algorithms, carried out on 6 representative systems with 5 different architectures and interconnection networks, has shown good performance and scalability in general, better than leading MPI algorithms in many cases, which confirms the potential of the developed algorithms for multicore and manycore architectures.
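The hierarchical trees mentioned in the abstract can be illustrated with a small sketch. This is not the thesis's algorithm, only one common way to build such a tree under stated assumptions: threads are placed on nodes in contiguous blocks, one leader per node takes part in a binomial tree across nodes, and each leader then forwards the data to the other threads on its node. The function name `hierarchical_broadcast_tree` and its parameters are hypothetical.

```python
def hierarchical_broadcast_tree(num_threads, threads_per_node, root=0):
    """Return {thread: [children]} for a two-level broadcast tree.

    Assumption (not from the source): thread t runs on node
    t // threads_per_node, i.e. block placement.
    """
    if num_threads <= 0 or threads_per_node <= 0:
        raise ValueError("num_threads and threads_per_node must be positive")
    if not 0 <= root < num_threads:
        raise ValueError("root must be a valid thread id")

    num_nodes = (num_threads + threads_per_node - 1) // threads_per_node
    root_node = root // threads_per_node

    # One leader per node: the root itself on its own node,
    # the first thread of the node everywhere else.
    leaders = [root if n == root_node else n * threads_per_node
               for n in range(num_nodes)]

    children = {t: [] for t in range(num_threads)}

    # Inter-node level: binomial tree over node ranks relative to the
    # root's node. The parent of a relative rank is that rank with its
    # highest set bit cleared.
    for rel in range(1, num_nodes):
        parent_rel = rel & ~(1 << (rel.bit_length() - 1))
        parent = leaders[(parent_rel + root_node) % num_nodes]
        child = leaders[(rel + root_node) % num_nodes]
        children[parent].append(child)

    # Intra-node level: each leader forwards to its local threads (flat).
    for n in range(num_nodes):
        first = n * threads_per_node
        last = min(first + threads_per_node, num_threads)
        for t in range(first, last):
            if t != leaders[n]:
                children[leaders[n]].append(t)

    return children


if __name__ == "__main__":
    tree = hierarchical_broadcast_tree(num_threads=8, threads_per_node=4)
    for parent, kids in tree.items():
        if kids:
            print(parent, "->", kids)
```

In this sketch only the node leaders send data over the network, so each node receives the data once regardless of how many threads it hosts. A real implementation would also have to choose the intra-node tree shape with the NUMA layout in mind, which this sketch ignores.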
