Online Checkpointing with Improved Worst-Case Guarantees

作者: Karl Bringmann , Benjamin Doerr , Adrian Neumann , Jakub Sliacan

DOI: 10.1287/IJOC.2014.0639

关键词:

摘要: In the online checkpointing problem, task is to continuously maintain a set of k checkpoints that allow rewinding an ongoing computation faster than by full restart. The only operation allowed replace old checkpoint current state. Our aim placement strategies minimize cost, i.e., such at all times T when requested rewind some time t ≤ number steps need be redone get from before as few possible. particular, we want closest earlier no farther away qk ideal distance T/k + 1, where small constant. Improving work showing 1 1/k 2, show can chosen asymptotically less 2. We present algorithms with asymptotic discrepancy 1.59 o1 valid for and ln4 1.39 being power two. Experiments indicate uniform bound pk 1.7 k. For k, how use linear programming approach compute good algorithms. This gives discrepancies 1.55 < 60. We prove first lower more namely ≥ 1.30-o1. also optimal yielding infimum exist

参考文章(13)
Lauri Ahlroth, Olli Pottonen, André Schumacher, Approximately uniform online checkpointing computing and combinatorics conference. pp. 297- 306 ,(2011) , 10.1007/978-3-642-22685-4_27
Fredrik Österlind, Adam Dunkels, Thiemo Voigt, Nicolas Tsiftes, Joakim Eriksson, Niclas Finne, Sensornet Checkpointing: Enabling Repeatability in Testbeds and Realism in Simulations international conference on embedded wireless systems and networks. pp. 343- 357 ,(2009) , 10.1007/978-3-642-00224-3_22
Vincent Heuveline, Andrea Walther, Online Checkpointing for Parallel Adjoint Computation in PDEs: Application to Goal-Oriented Adaptivity and Flow Control Euro-Par 2006 Parallel Processing. pp. 689- 699 ,(2006) , 10.1007/11823285_72
Philipp Stumm, Andrea Walther, New Algorithms for Optimal Online Checkpointing SIAM Journal on Scientific Computing. ,vol. 32, pp. 836- 854 ,(2010) , 10.1137/080742439
Erol Gelenbe, On the Optimum Checkpoint Interval Journal of the ACM. ,vol. 26, pp. 259- 270 ,(1979) , 10.1145/322123.322131
Sangho Yi, Derrick Kondo, Artur Andrzejak, Reducing Costs of Spot Instances via Checkpointing in the Amazon Elastic Compute Cloud international conference on cloud computing. pp. 236- 243 ,(2010) , 10.1109/CLOUD.2010.35
M. Bern, D. H. Greene, A. Raghunathan, M. Sudan, Online algorithms for locating checkpoints symposium on the theory of computing. pp. 359- 368 ,(1990) , 10.1145/100216.100264
Marshall Bern, Daniel H. Greene, Arvind Raghunathan, Madhu Sudan, On-line algorithms for locating checkpoints Algorithmica. ,vol. 11, pp. 33- 52 ,(1994) , 10.1007/BF01294262
Sam Toueg, Özalp Babaoglu, On the optimum checkpoint selection problem SIAM Journal on Computing. ,vol. 13, pp. 630- 649 ,(1984) , 10.1137/0213039
Lauri Ahlroth, Olli Pottonen, André Schumacher, Approximately Uniform Online Checkpointing with Bounded Memory Algorithmica. ,vol. 67, pp. 234- 246 ,(2013) , 10.1007/S00453-013-9772-5