Authors: Fulu Li, Mohsin Beg
DOI:
Keywords:
Abstract: A method and apparatus are provided for determining that problems have occurred within a complex multi-host system and for identifying, for each problem, a sequence of causes and effects called a fault cause path, starting with a root cause. A probabilistic model representing the cause/effect relationships among potential problems identifies the probability of a problem in the system. Such failure probabilities may be determined based on aggregating, over a recent time interval, values determined by the model. Each fault cause path has an associated accuracy value reflecting its expected accuracy relative to other fault cause paths. When more than one fault cause path is identified, a number of fault cause paths are ranked and displayed in order of their accuracy value.
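
The two steps named in the abstract, aggregating model values over a recent time interval and ranking fault cause paths by accuracy value, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names FaultCausePath, failure_probability, and rank_paths are hypothetical, the mean is an assumed aggregation (the abstract does not say which one is used), and the sample data is invented.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class FaultCausePath:
    # Ordered cause/effect chain, root cause first (hypothetical structure).
    steps: list[str]
    # Expected accuracy of this path relative to other paths.
    accuracy: float


def failure_probability(samples: list[tuple[float, float]],
                        now: float, window: float) -> float:
    """Aggregate (timestamp, probability) samples over the recent window.

    The mean is an assumed aggregation; the abstract does not name one.
    Returns 0.0 when no sample falls inside the window.
    """
    recent = [p for t, p in samples if now - window <= t <= now]
    return mean(recent) if recent else 0.0


def rank_paths(paths: list[FaultCausePath], top_n: int) -> list[FaultCausePath]:
    """Return up to top_n paths in descending order of accuracy value."""
    return sorted(paths, key=lambda p: p.accuracy, reverse=True)[:top_n]


if __name__ == "__main__":
    # Invented sample data: (timestamp in seconds, model probability).
    samples = [(0.0, 0.9), (8.0, 0.2), (9.0, 0.4)]
    print(failure_probability(samples, now=10.0, window=5.0))

    paths = [
        FaultCausePath(["disk full", "log write fails", "service down"], 0.6),
        FaultCausePath(["network partition", "service down"], 0.8),
    ]
    for path in rank_paths(paths, top_n=2):
        print(path.accuracy, " -> ".join(path.steps))
```

The sketch keeps aggregation and ranking as separate functions because the abstract describes them as separate stages: failure probabilities indicate that a problem exists, and accuracy values decide the order in which candidate explanations are shown.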