Authors: Hongmei Yang, Yongquan Liang, Lianshan Liu, Haibin Sun
Keywords:
Abstract: As the size and complexity of cluster systems grow, failure rates accelerate dramatically. To reduce the damage caused by failures, it is desirable to identify potential failures ahead of their occurrence. In this paper, we survey the state of the art in failure prediction for cluster systems. The characteristics of failures are addressed, and some statistical results are shown. We explore ways of collecting and preprocessing data for prediction, and suggest a procedure for preprocessing the records in automatically generated log files. Focusing on their main ideas, we analyze five prediction methods in turn: threshold-based methods, time series analysis, rule-based classification, Bayesian network models, and semi-Markov process models. In addition, with regard to accuracy and practicality, we present metrics for evaluating prediction techniques and compare the five methods using these metrics.