Probabilistic accuracy bounds for fault-tolerant computations that discard tasks

Author: Martin Rinard

DOI: 10.1145/1183401.1183447

Keywords: Statistical model, Execution time, Real-time computing, Algorithm, Robustness (computer science), Fault tolerance, Computer science, Computation, Probabilistic logic

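Abstract: We present a new technique for enabling computations to survive errors and faults while providing a bound on any resulting output distortion. A developer using the technique first partitions the computation into tasks. The execution platform then simply discards any task that encounters an error or a fault and completes the computation by executing any remaining tasks. This technique can substantially improve the robustness of the computation in the face of errors and faults. A potential concern is that discarding tasks may change the result that the computation produces. Our technique randomly samples executions of the program at varying task failure rates to obtain a quantitative, probabilistic model that characterizes the distortion of the output as a function of the task failure rates. By providing probabilistic bounds on the distortion, the model allows users to confidently accept results produced by executions with failures as long as the distortion falls within acceptable bounds. This approach may prove to be especially useful for enabling computations to successfully survive hardware failures in distributed computing environments. Our technique also produces a timing model that characterizes the execution time as a function of the task failure rates. The combination of the distortion and timing models quantifies an accuracy/execution time tradeoff. It therefore enables the development of techniques that purposefully fail tasks to reduce the execution time while keeping the distortion within acceptable bounds.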
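
The sampling idea in the abstract can be illustrated with a small sketch. This is not the paper's implementation. It assumes a toy workload in which each task contributes one number to a sum, a failed task is simply dropped, and the distortion is the relative error of the result. The straight-line least-squares fit stands in for the paper's regression model, and it captures only the average trend, not the probabilistic bounds. All function names (`run_with_discards`, `sample_distortion`, `fit_line`) are hypothetical.

```python
import random


def run_with_discards(task_values, failure_rate, rng):
    """Sum the outputs of surviving tasks; each task fails independently."""
    return sum(v for v in task_values if rng.random() >= failure_rate)


def sample_distortion(task_values, failure_rates, trials, rng):
    """Return (failure_rate, relative_distortion) pairs from random executions."""
    exact = sum(task_values)
    samples = []
    for rate in failure_rates:
        for _ in range(trials):
            approx = run_with_discards(task_values, rate, rng)
            samples.append((rate, abs(exact - approx) / abs(exact)))
    return samples


def fit_line(samples):
    """Ordinary least-squares fit: distortion ~ intercept + slope * failure_rate."""
    xs = [x for x, _ in samples]
    ys = [y for _, y in samples]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    # Assumed workload: 1000 tasks with positive outputs.
    tasks = [rng.uniform(0.5, 1.5) for _ in range(1000)]
    rates = [0.0, 0.05, 0.1, 0.2, 0.3]
    samples = sample_distortion(tasks, rates, trials=20, rng=rng)
    intercept, slope = fit_line(samples)
    print(f"distortion ~ {intercept:.3f} + {slope:.3f} * failure_rate")
```

For this additive workload the fitted slope should come out close to 1, since dropping a fraction of the tasks removes roughly that fraction of the sum. A real application would substitute its own task decomposition and distortion metric.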

References (19)
Brian Demsky, Martin Rinard. Data structure repair using goal-directed reasoning. International Conference on Software Engineering, pp. 176-185 (2005). DOI: 10.1145/1062455.1062499
Ramon C. Littell, Rudolf Jakob Freund. SAS System for Regression, Third Edition. SAS Publishing (2000).
Michael R. Lyu. Software Fault Tolerance. John Wiley & Sons, Inc. (1995).
Jim Gray, Andreas Reuter. Transaction Processing: Concepts and Techniques (1992).
Martin Rinard, Cristian Cadar, William S. Beebee, Daniel M. Roy, Tudor Leu, Daniel Dumitran. Enhancing server availability and security through failure-oblivious computing. Operating Systems Design and Implementation, pp. 21-21 (2004).
Jaswinder Pal Singh, Wolf-Dietrich Weber, Anoop Gupta. SPLASH: Stanford parallel applications for shared-memory. ACM SIGARCH Computer Architecture News, vol. 20, pp. 5-44 (1992). DOI: 10.1145/130823.130824
J. M. Harris, S. Lazaratos, R. Michelena. Tomographic string inversion. SEG Technical Program Expanded Abstracts, pp. 82-85 (1990). DOI: 10.1190/1.1890353
Angelos D. Keromytis, Stelios Sidiroglou. Using Execution Transactions To Recover From Buffer Overflow Attacks. Department of Computer Science, Columbia University (2004). DOI: 10.7916/D8B56WZF
Martin Rinard. Acceptability-oriented computing. Conference on Object-Oriented Programming Systems, Languages, and Applications, pp. 221-239 (2003). DOI: 10.1145/949344.949402