Extending the scope of the controlled logical clock

Authors: Daniel Becker, Markus Geimer, Rolf Rabenseifner, Felix Wolf

DOI: 10.1007/S10586-011-0181-8

Keywords: Distributed computing; Synchronization (computer science); Logical clock; Computer science; Event (computing); Scope (computer science); Power (physics); Semantics (computer science); Synchronization

Abstract: Event traces are helpful in understanding the performance behavior of parallel applications since they allow the in-depth analysis of communication and synchronization patterns. However, the absence of synchronized clocks on most cluster systems may render the analysis ineffective because inaccurate relative event timings may misrepresent the logical event order and lead to errors when quantifying the impact of certain behaviors, or confuse the users of time-line visualization tools by showing messages flowing backward in time. In our earlier work, we have developed a scalable algorithm called the controlled logical clock that eliminates inconsistent inter-process timings postmortem in traces of pure MPI applications, potentially running on large processor configurations. In this paper, we first demonstrate that our algorithm also proves beneficial in computational grids, where a single application is executed using the combined computational power of several geographically dispersed clusters. Second, we present an extended version of the algorithm that, in addition to message-passing event semantics, also preserves and restores shared-memory event semantics, enabling the correction of traces from hybrid applications.
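The kind of inconsistency the abstract describes (a message that appears to be received before it was sent) can be illustrated with a small Python sketch. This is not the authors' controlled logical clock, only a simplified illustration of the clock condition such a correction enforces: a receive must not be timestamped earlier than the matching send plus a minimum message latency. The names `Event`, `correct_traces`, and `min_latency` are hypothetical and chosen for this example. The sketch pushes a violating receive forward and shifts all later events on the same process by the same amount. It does not attempt the interval-preserving refinements or the shared-memory semantics covered in the paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Event:
    """One trace record (hypothetical layout for this sketch)."""
    time: float                   # local timestamp of the event
    kind: str                     # "send", "recv", or "local"
    msg_id: Optional[int] = None  # links a recv to its matching send


def correct_traces(traces: Dict[int, List[Event]], min_latency: float) -> int:
    """Enforce recv.time >= send.time + min_latency in place.

    `traces` maps a process rank to its events in local time order.
    Returns the number of violations that were corrected.
    Assumes every recv has a matching send and the trace has no
    cyclic message dependencies.
    """
    # Index send events by message id. The Event objects are shared,
    # so later timestamp updates are visible through this index.
    sends = {
        ev.msg_id: ev
        for events in traces.values()
        for ev in events
        if ev.kind == "send"
    }

    corrected = 0
    changed = True
    # Repeat until a full pass finds no violation, because moving a
    # send forward can invalidate a receive on another process.
    while changed:
        changed = False
        for events in traces.values():
            shift = 0.0  # accumulated forward shift for this process
            for ev in events:
                ev.time += shift
                if ev.kind == "recv":
                    required = sends[ev.msg_id].time + min_latency
                    if ev.time < required:
                        # Push the receive forward and carry the same
                        # offset to all later events on this process.
                        shift += required - ev.time
                        ev.time = required
                        corrected += 1
                        changed = True
    return corrected
```

For example, if process 0 records a send at time 10.0 and process 1 records the matching receive at 9.0 followed by a local event at 9.5, calling `correct_traces` with `min_latency=0.5` moves the receive to 10.5 and the local event to 11.0, so the local ordering and spacing on process 1 are kept.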
