Experience mining Google's production console logs

Authors: Wei Xu, Ling Huang, Michael Jordan

Abstract: We describe our early experience in applying console log mining techniques [19, 20] to logs from production Google systems with thousands of nodes. This data set is five orders of magnitude larger in size and contains almost 20 times as many message types as the Hadoop data set we used in [19]. It also has properties that are unique to large-scale deployments (e.g., the system stays on for several months, and multiple versions of the software can run concurrently). Our experience shows that our techniques, including source code based log parsing, state and sequence based feature creation, and problem detection, work well on this data set. We also discuss using our log parser to assist log sanitization.

References (21)
Stephen E. Hansen, E. Todd Atkins. Automated System Monitoring and Notification With Swatch. USENIX Large Installation Systems Administration Conference, pp. 145-152 (1993).
Risto Vaarandi. A Breadth-First Algorithm for Mining Frequent Patterns from Event Logs. Intelligence in Communication Systems, pp. 293-308 (2004). doi:10.1007/978-3-540-30179-0_27
Scott Shenker, George Porter, Ion Stoica, Randy H. Katz, Rodrigo Fonseca. X-Trace: A Pervasive Network Tracing Framework. Networked Systems Design and Implementation, pp. 20-20 (2007).
Eric Brewer, Emre Kiciman, Mike Y. Chen, Armando Fox, Anthony Accardi, Jim Lloyd, Dave Patterson. Path-Based Failure and Evolution Management. Networked Systems Design and Implementation, pp. 23-23 (2004).
R. Vaarandi. A Data Clustering Algorithm for Mining Patterns from Event Logs. IP Operations and Management, pp. 119-126 (2003). doi:10.1109/IPOM.2003.1251233
Kathleen Fisher, David Walker, Kenny Q. Zhu, Peter White. From Dirt to Shovels. Proceedings of the 35th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL '08), vol. 43, pp. 421-434 (2008). doi:10.1145/1328438.1328488
Chunyu Kit, Haihua Pan, Hongbiao Chen. Learning Case-Based Knowledge for Disambiguating Chinese Word Segmentation: A Preliminary Study. International Conference on Computational Linguistics, pp. 1-7 (2002). doi:10.3115/1118824.1118832
J. Edward Jackson, Govind S. Mudholkar. Control Procedures for Residuals Associated With Principal Component Analysis. Technometrics, vol. 21, pp. 341-349 (1979). doi:10.1080/00401706.1979.10489779
Emanuel Graf, Guido Zgraggen, Peter Sommerlad. Refactoring Support for the C++ Development Tooling. Companion to the 22nd ACM SIGPLAN Conference on Object Oriented Programming Systems and Applications (OOPSLA '07), pp. 781-782 (2007). doi:10.1145/1297846.1297885