Authors: Atul Singh, Petros Maniatis, Timothy Roscoe, Peter Druschel
Keywords:
Abstract: Distributed systems are hard to build, profile, debug, and test. Monitoring a distributed system - to detect and analyze bugs, test for regressions, identify fault-tolerance problems or security compromises - can be difficult and error-prone. In this paper we argue that declarative development of distributed systems is well suited to tackle these tasks. We present an application logging, monitoring, and debugging facility that we have built on top of the P2 system, comprising an introspection model, an execution tracing component, and a distributed query processor. We use this facility to demonstrate a range of on-line distributed diagnosis tools, from simple, local state assertions to sophisticated global property detectors on consistent snapshots. These tools are small and simple, and can be deployed piecemeal on-line at any point during a distributed system's life cycle. Our evaluation suggests that the overhead of our approach to improving and monitoring running distributed systems continuously is well in tune with its benefits.