Authors: Satish Narayanasamy, Brad Calder
DOI:
Keywords:
Abstract: The processor industry is at an inflection point. In the past, performance was the driving force behind the industry. But in the coming many-core era, improving the programmability and reliability of the system will be at least as important as raw performance. To meet this vision, this thesis presents a feature that assists programmers in understanding software failures. Reproducing failures is a significant challenge. The problem is especially severe for multi-threaded programs because the causes of a failure can be non-deterministic in nature. The proposed feature continuously logs a program's execution while sacrificing very little performance (1%). If a program crashes, the developer can use the log to debug by deterministically replaying every single instruction executed as part of the failed execution. Two key mechanisms enable the deterministic replay feature. One is BugNet, a checkpointing technique, which logs all input to a thread by logging the values of load instructions. The other is Strata, a primitive for recording shared-memory dependencies in a snoop-based or directory-based multi-processor. The former is sufficient for uni-processor systems, and the latter is required for multi-processor systems. As a proof-of-concept, an implementation of the BugNet replayer was built using the Pin instrumentation tool. To understand the space requirements of a recorder for debugging, the thesis empirically quantifies how much of a program's execution needs to be logged and replayed in order to find the root cause of the majority of bugs. Finally, to demonstrate the utility of the feature, the thesis presents a tool that finds data race bugs and automatically prioritizes them. The data race detection tool was developed in collaboration with Microsoft. It has been used to find and fix bugs in production code, including Windows Vista and Internet Explorer.
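The idea of replaying a thread from logged load values can be illustrated with a toy model. The following is only a minimal sketch under stated assumptions, not the thesis's hardware design: the `Recorder`, `Replayer`, and `program` names are hypothetical, memory is a plain dictionary, and every load is logged (the checkpointing side of BugNet is not modeled).

```python
from typing import Dict, List


class Recorder:
    """Runs against real memory and logs the value returned by every load."""

    def __init__(self, memory: Dict[int, int]) -> None:
        self.memory = memory
        self.log: List[int] = []

    def load(self, addr: int) -> int:
        value = self.memory.get(addr, 0)
        self.log.append(value)  # the load value is the thread's only input
        return value

    def store(self, addr: int, value: int) -> None:
        self.memory[addr] = value


class Replayer:
    """Re-executes using only the log; it never sees the original memory."""

    def __init__(self, log: List[int]) -> None:
        self._values = iter(log)

    def load(self, addr: int) -> int:
        # Loads are served in program order straight from the log.
        return next(self._values)

    def store(self, addr: int, value: int) -> None:
        # Stored values are recomputed by the program itself during replay,
        # so this toy replayer does not need to keep them.
        pass


def program(m) -> int:
    """Hypothetical thread body: sums two inputs, then reads back its store."""
    total = m.load(0x10) + m.load(0x14)
    m.store(0x18, total)
    return m.load(0x18) * 2


recorder = Recorder({0x10: 3, 0x14: 4})
recorded_result = program(recorder)
replayed_result = program(Replayer(recorder.log))
```

In this sketch the replayed run reproduces the recorded result without access to the original memory contents, because every value the thread consumed arrived through a logged load. Dependencies between threads on a multi-processor, which the abstract attributes to Strata, are outside what this single-threaded toy shows.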