Authors: Kathryn Mohror, Karen L. Karavanic
Keywords: Data mining, Similarity (geometry), Distributed computing, TRACE (psycholinguistics), Scalability, Computer science, Event (computing), Reduction (complexity), Volume (computing)
Abstract: Event traces are required to correctly diagnose a number of performance problems that arise on today's highly parallel systems. Unfortunately, the collection of event traces can produce a large volume of data that is difficult, or even impossible, to store and analyze. One approach for compressing a trace is to identify repeating trace patterns and retain only one representative of each pattern. However, determining the similarity of sections of traces, i.e., identifying patterns, is not straightforward. In this paper, we investigate pattern-based methods for reducing traces that will be used for performance analysis. We evaluate the different methods against several criteria, including size reduction, introduced error, and retention of performance trends, using both benchmarks with carefully chosen performance behaviors and a real application.
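
As a rough illustration of the general idea in the abstract (keep one representative per repeating pattern), the following is a minimal sketch and not one of the methods evaluated in the paper. The trace format (lists of `(event name, duration)` pairs), the function names `similar` and `reduce_trace`, and the relative-tolerance similarity test are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical trace format: each segment is a list of (event name, duration).
Event = Tuple[str, float]
Segment = List[Event]


def similar(a: Segment, b: Segment, tol: float = 0.1) -> bool:
    """Assumed similarity test: same event names in the same order, and each
    pair of durations differs by at most `tol` relative to the larger one."""
    if len(a) != len(b):
        return False
    for (name_a, dur_a), (name_b, dur_b) in zip(a, b):
        if name_a != name_b:
            return False
        largest = max(abs(dur_a), abs(dur_b))
        if largest > 0 and abs(dur_a - dur_b) / largest > tol:
            return False
    return True


def reduce_trace(segments: List[Segment], tol: float = 0.1):
    """Return (representatives, mapping): one stored segment per pattern and,
    for every original segment, the index of the representative standing in for it."""
    representatives: List[Segment] = []
    mapping: List[int] = []
    for seg in segments:
        for idx, rep in enumerate(representatives):
            if similar(seg, rep, tol):
                mapping.append(idx)  # reuse an existing pattern
                break
        else:
            # No stored pattern matched, so this segment becomes a new representative.
            representatives.append(seg)
            mapping.append(len(representatives) - 1)
    return representatives, mapping


# Made-up example data: four loop iterations, one of which behaves differently.
trace = [
    [("compute", 1.00), ("MPI_Send", 0.20)],
    [("compute", 1.05), ("MPI_Send", 0.21)],
    [("compute", 3.00), ("MPI_Send", 0.20)],
    [("compute", 0.98), ("MPI_Send", 0.19)],
]
reps, mapping = reduce_trace(trace)
print(len(reps), mapping)
```

In this sketch the tolerance `tol` exposes the trade-off the abstract names: a looser tolerance stores fewer representatives (greater size reduction) but lets the durations of replaced segments deviate further from their stand-ins (more introduced error).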