Deterministic replay using processor support and its applications

Authors: Satish Narayanasamy, Brad Calder

Abstract: The processor industry is at an inflection point. In the past, performance was the driving force behind the industry. But in the coming many-core era, improving the programmability and reliability of the system will be at least as important as raw performance. To meet this vision, this thesis presents a processor feature that assists programmers in understanding software failures. Reproducing failures is a significant challenge. The problem is especially severe for multi-threaded programs, because the causes of a failure can be non-deterministic in nature. The proposed feature continuously logs a program's execution while sacrificing very little performance (1%). If the program crashes, the developer can use the log to debug it by deterministically replaying every single instruction executed as part of the failed execution. Two key mechanisms enable the deterministic replay feature. One is BugNet, a checkpointing technique, which logs all input to a thread by logging the values of load instructions. The other is Strata, a primitive for recording shared-memory dependencies in a snoop-based or directory-based multi-processor. The former is sufficient for uni-processor systems, and the latter is required for multi-processor systems. As a proof of concept, an implementation of the BugNet replayer was built using the Pin instrumentation tool. To understand the space requirements of a recorder for debugging, the thesis empirically quantifies how much of an execution needs to be logged and replayed in order to root-cause the majority of bugs. Finally, to demonstrate the utility of the feature, the thesis presents a tool that finds data race bugs and automatically prioritizes them. The data race detection tool was built in collaboration with Microsoft. It has been used to find and fix bugs in production code, including Windows Vista and Internet Explorer.
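The load-logging idea summarized in the abstract can be illustrated with a toy sketch. This is not BugNet itself. It assumes a hypothetical three-instruction mini-ISA (`load`, `add`, `store`), and the names `PROGRAM`, `run`, `record`, and `replay` are invented for illustration. It shows only why a log of load values is enough to re-execute a thread deterministically: during replay the original memory is never consulted, so whatever wrote to it (other threads, I/O) does not need to be reproduced.

```python
from collections import deque

# Hypothetical mini-ISA used only for this illustration.
PROGRAM = [
    ("load", "r0", 0x10),        # r0 <- mem[0x10]
    ("load", "r1", 0x14),        # r1 <- mem[0x14]
    ("add", "r2", "r0", "r1"),   # r2 <- r0 + r1
    ("store", "r2", 0x18),       # mem[0x18] <- r2
    ("load", "r3", 0x18),        # r3 <- mem[0x18]
]


def run(program, load, store):
    """Execute the program, routing every memory access through callbacks."""
    regs = {}
    for ins in program:
        op = ins[0]
        if op == "load":
            _, dst, addr = ins
            regs[dst] = load(addr)
        elif op == "add":
            _, dst, a, b = ins
            regs[dst] = regs[a] + regs[b]
        elif op == "store":
            _, src, addr = ins
            store(addr, regs[src])
        else:
            raise ValueError(f"unknown instruction: {op}")
    return regs


def record(program, memory):
    """Run against real memory and log the value returned by every load."""
    log = []

    def load(addr):
        value = memory[addr]
        log.append(value)
        return value

    def store(addr, value):
        memory[addr] = value

    return run(program, load, store), log


def replay(program, log):
    """Re-run with no memory at all: loads are served from the log in order."""
    pending = deque(log)
    # Stores are ignored because later loads are already covered by the log.
    return run(program, lambda addr: pending.popleft(), lambda addr, value: None)


memory = {0x10: 7, 0x14: 35, 0x18: 0}
recorded_regs, load_log = record(PROGRAM, memory)
replayed_regs = replay(PROGRAM, load_log)
print(replayed_regs == recorded_regs)
```

The sketch logs every load for simplicity. It does not model checkpointing, log-size reduction, or the shared-memory dependency recording that the abstract attributes to Strata.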
