Authors: Christopher Weaver, Joel Emer, Shubhendu S. Mukherjee, Steven K. Reinhardt
Keywords:
Abstract: Transient faults due to neutron and alpha particle strikes pose a significant obstacle to increasing processor transistor counts in future technologies. Although fault rates of individual transistors may not rise significantly, incorporating more transistors into a device makes that device more likely to encounter a fault. Hence, maintaining error rates at acceptable levels will require increasing design effort. This paper proposes two simple approaches to reduce error rates and evaluates their application to a microprocessor instruction queue. The first technique reduces the time instructions sit in vulnerable storage structures by selectively squashing instructions when long delays are encountered. A fault is less likely to cause an error if the structure it affects does not contain valid instructions. We introduce a new metric, MITF (Mean Instructions To Failure), to capture the trade-off between performance and reliability introduced by this approach. The second technique addresses false detected errors. In the absence of a fault detection mechanism, such errors would not have affected the final outcome of a program. For example, a fault affecting the result of a dynamically dead instruction would not change the final program output, but could still be flagged by the hardware as an error. To avoid signaling such false errors, we modify a pipeline's error detection logic to mark instructions and data as possibly incorrect rather than immediately signaling an error. Then, we signal an error only if we determine later that the possibly incorrect value could have affected the program's output.