Authors: Yunji Chen, Tianshi Chen, Ling Li, Ruiyang Wu, Daofu Liu
Keywords:
Abstract: Debugging parallel programs is a well-known difficult problem. A promising method to facilitate debugging parallel programs is to use hardware support to achieve deterministic replay on a Chip Multi-Processor (CMP). As a Design-For-Debug (DFD) feature, a practical hardware-assisted deterministic replay scheme should have low design and verification costs, as well as a small log size. To achieve these goals, we propose a novel and succinct hardware-assisted deterministic replay scheme named LReplay. The key innovation of LReplay is that, instead of recording the logical time orders between instructions or instruction blocks as in previous investigations, LReplay is built upon recording the pending period information infused by the global clock. From the recorded pending period information, about 99% of the execution orders are inferrable, implying that LReplay only needs to record directly the residual 1% of noninferrable execution orders in the production run. These 1% noninferrable orders can be addressed by a simple yet cost-effective direction prediction technique, which further reduces the log size of LReplay. Benefiting from the preceding innovations, the overall log size of LReplay over the SPLASH-2 benchmarks is about 0.17 B/K-Inst (byte per k-instruction) for sequential consistency, and 0.57 B/K-Inst for Godson-3 consistency. Such log sizes are smaller by an order of magnitude than those of previous deterministic replay schemes incurring no performance loss. Furthermore, LReplay consumes only about 0.5% of the area of the Godson-3 CMP, since it requires only trivial modifications to the existing components of Godson-3. The above features of LReplay demonstrate the potential of integrating hardware-assisted deterministic replay into future industrial processors.
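As a rough illustration of how an execution order might be inferred from pending periods, the sketch below assumes that each memory operation's pending period can be modeled as a closed interval of global-clock cycles, and that two operations are ordered whenever their intervals do not overlap. The names `MemOp` and `infer_order`, and the non-overlap rule itself, are hypothetical simplifications for illustration and are not taken from the paper's hardware design.

```python
from typing import NamedTuple, Optional, Tuple


class MemOp(NamedTuple):
    """Hypothetical record of one memory operation's pending period."""
    name: str
    start: int  # assumed: global-clock cycle at which the operation becomes pending
    end: int    # assumed: global-clock cycle at which the operation completes


def infer_order(a: MemOp, b: MemOp) -> Optional[Tuple[str, str]]:
    """Return (first, second) if the order follows from the periods alone.

    Returns None when the two pending periods overlap, i.e. the order is
    not inferrable under this simplified model and would have to be
    logged explicitly.
    """
    if a.end < b.start:
        return (a.name, b.name)
    if b.end < a.start:
        return (b.name, a.name)
    return None


if __name__ == "__main__":
    store = MemOp("store_x", start=0, end=5)
    load = MemOp("load_x", start=8, end=12)
    late_store = MemOp("store_y", start=10, end=20)

    print(infer_order(store, load))       # disjoint periods: order is inferrable
    print(infer_order(load, late_store))  # overlapping periods: not inferrable
```

Under this simplified model, only the overlapping pairs would need an explicit log entry, which is consistent with the abstract's point that most orders need not be recorded directly.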