作者: Wei Xu , Ling Huang , Michael Jordan , None
DOI:
关键词:
摘要: We describe our early experience in applying console log mining techniques [19, 20] to logs from production Google systems with thousands of nodes. This data set is five orders magnitude size and contains almost 20 times as many messages types the Hadoop we used [19]. It also has properties that are unique large scale deployments (e.g., system stays on for several months multiple versions software can run concurrently). Our shows techniques, including source code based parsing, state sequence feature creation problem detection, work well this set. discuss using parser assist sanitization.