Author: Aaron Brown
DOI: 10.21236/ADA603322
Keywords:
Abstract: Motivated by the pressing need for increased dependability in corporate and Internet services, and by the perspective that effective recovery can improve dependability as much as or more than avoiding failures, we introduce a novel recovery mechanism that gives human system operators the power of system-wide undo. System-wide undo allows operators to roll back erroneous changes to a service's state without losing end-user data updates, to make retroactive repairs in the historical timeline of the service system, and thereby to quickly recover from catastrophic state corruption, operator error, failed upgrades, and external attacks, even when the root cause of the catastrophe is unknown. We explore system-wide undo via a framework based on the concept of spheres of undo, bubbles of state and time that provide scope to the state recoverable by undo and serve as a structuring tool for implementing undo on standalone services, hierarchically-composed systems, and distributed interacting services. Crucially, spheres of undo allow us to define paradoxes, inconsistencies that occur when an undo process retroactively alters state that has been exposed outside of its containing sphere of undo. Managing paradoxes is the grand challenge of system-wide undo, and we tackle it with a framework that automatically detects and compensates for paradoxes; our approach exploits the relaxed consistency semantics already present in existing services that interact with human end-users. We describe an implementation of this framework. We explore the applicability of system-wide undo by assembling and evaluating a prototype undoable e-mail store service, by analyzing what would be necessary to construct an undoable online auction service, and by developing a set of guidelines to help service designers retrofit undo to their services. We find that undo functionality imposes non-negligible but tolerable overhead in terms of both time and space. Using a methodology we develop to benchmark human-assisted recovery processes, we also find that undo-based recovery has a net positive effect on dependability, providing significant improvements in correctness while only slightly degrading availability.