Authors: Ana Gainaru, Franck Cappello, Joshi Fullop, Stefan Trausan-Matu, William Kramer
Keywords:
Abstract: In this paper, we analyse messages generated by different large-scale HPC systems in order to extract sequences of correlated events, which we later use to predict the normal and faulty behaviour of the system. Our method uses a dynamic window strategy that is able to find frequent sequences of events regardless of the time delay between them. Most current related research narrows the correlation extraction to fixed and relatively small time windows that do not reflect the whole behaviour of the system. The generated events are in constant change during the lifetime of the machine. We consider it important to update the sequences at runtime by applying modifications after each prediction phase, according to the forecast's accuracy and the difference between what was expected and what really happened. Our experiments show that our analysing system is able to predict around 60% of events with a precision of 85%, at a lower event granularity than before.
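For readers unfamiliar with the two figures quoted in the abstract, the sketch below shows how precision and the fraction of events predicted (recall) are commonly computed. It is only an illustration under stated assumptions: the function name `score_predictions` and the representation of events as plain identifiers are hypothetical and are not taken from the paper, which may match predictions to events differently (for example, by time and location).

```python
def score_predictions(predicted, occurred):
    """Return (precision, recall) for predicted event ids versus
    the event ids that actually occurred.

    Assumption: each event is a hashable identifier and a prediction
    counts as correct when the same identifier actually occurred.
    """
    predicted, occurred = set(predicted), set(occurred)
    hits = predicted & occurred  # correctly predicted events
    # Precision: share of predictions that turned out to be right.
    precision = len(hits) / len(predicted) if predicted else 0.0
    # Recall: share of actual events that were predicted.
    recall = len(hits) / len(occurred) if occurred else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical toy data: 4 predictions, 5 actual events, 3 in common.
    p, r = score_predictions(["a", "b", "c", "d"], ["a", "b", "c", "x", "y"])
    print(f"precision={p:.2f} recall={r:.2f}")
```

In this toy run three of four predictions are correct and three of five actual events are covered, so the two measures differ even though they share the same numerator.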