Can PDES scale in environments with heterogeneous delays?

Authors: Jingjing Wang, Ketan Bahulkar, Dmitry Ponomarev, Nael Abu-Ghazaleh

DOI: 10.1145/2486092.2486098

Keywords: Scalability, Distributed computing, Bottleneck, Computer science, Parallel computing, Polling, Software, Shared memory, Discrete event simulation, Latency (engineering), Chip

Abstract: The performance and scalability of Parallel Discrete Event Simulation (PDES) are often limited by communication latencies and overheads. The emergence of multi-core processors, and their expected evolution into many-cores, offers the promise of low-latency communication and tight memory integration between cores; these properties should significantly improve the performance of PDES in such environments. However, on clusters of multi-cores (CMs), the latency and processing overheads incurred when communicating between different machines (nodes) far outweigh those between cores on the same chip, especially when commodity networking fabrics and software are used. It is unclear if there is any benefit to the low latency among cores on the same node given that the communication links across nodes are significantly worse. In this study, we examine the performance of a multi-threaded implementation of PDES on CMs. We demonstrate that inter-node communication costs impose a substantial bottleneck on PDES and that, without optimizations addressing these long latencies, multi-threaded PDES does not significantly outperform the multiprocess version despite direct communication through shared memory on individual nodes. We then propose three optimizations: message consolidation and routing, infrequent polling, and latency-sensitive model partitioning. We show that with these optimizations in place, threaded PDES significantly outperforms the process-based implementation even on CMs.
