Diagnosing distributed systems with self-propelled instrumentation

Authors: Barton P. Miller, Alexander V. Mirgorodskiy

DOI: 10.5555/1496950.1496957

Abstract: We present a three-part approach for diagnosing bugs and performance problems in production distributed environments. First, we introduce a novel execution monitoring technique that dynamically injects a fragment of code, the agent, into an application process on demand. The agent inserts instrumentation ahead of the control flow within the process and propagates into other processes, following communication events, crossing host boundaries, and collecting a distributed function-level trace of the execution. Second, we present an algorithm that separates the trace into user-meaningful activities called flows. This step simplifies manual examination and enables automated analysis of the trace. Finally, we describe our automated root cause analysis technique, which compares the flows to help the analyst locate an anomalous flow and identify a function in that flow that is a likely cause of the anomaly. We demonstrate the effectiveness of our techniques by diagnosing two complex problems in the Condor distributed scheduling system.

References (44)
Anupam Chanda, Alan L. Cox, Khaled Elmeleegy, Willy Zwaenepoel. Causeway: Support for Controlling and Analyzing the Execution of Web-Accessible Applications. International Middleware Conference (2005).
Richard Mortier, Rebecca Isaacs, Dushyanth Narayanan, Paul Barham. Magpie: online modelling and performance-aware systems. Hot Topics in Operating Systems, pp. 15-15 (2003).
Nicholas Nethercote, Julian Seward. Valgrind: A Program Supervision Framework. Electronic Notes in Theoretical Computer Science, vol. 89, pp. 44-66 (2003). 10.1016/S1571-0661(04)81042-9
Richard Mortier, Rebecca Isaacs, Austin Donnelly, Paul Barham. Using Magpie for request extraction and workload modelling. Operating Systems Design and Implementation, pp. 18-18 (2004).
Eric Brewer, Emre Kiciman, Mike Y. Chen, Armando Fox, Anthony Accardi, Jim Lloyd, Dave Patterson. Path-based failure and evolution management. Networked Systems Design and Implementation, pp. 23-23 (2004).
Jack Davidson, Kevin Scott. Strata: A Software Dynamic Translation Infrastructure. University of Virginia (2001).
Saman P. Amarasinghe, Evelyn Duesterwald, Derek L. Bruening. Design and implementation of a dynamic optimization framework for Windows (2000).
B.P. Miller. DPM: a measurement system for distributed programs. IEEE Transactions on Computers, vol. 37, pp. 243-248 (1988). 10.1109/12.2157
Alexander V. Mirgorodskiy, Barton P. Miller. Autonomous analysis of interactive systems with self-propelled instrumentation. Conference on Multimedia Computing and Networking, vol. 5680, pp. 188-202 (2005). 10.1117/12.592738