Authors: Andrea Bondavalli, Andrea Ceccarelli, Francesco Brancati, Diego Santoro, Michele Vadursi
DOI: 10.1016/J.MEASUREMENT.2015.11.010
Keywords: Reliability (computer networking), Anomaly detection, Real-time computing, Fault tolerance, Operating system, Fault detection and isolation, System monitoring, Dependability, Air traffic management, Engineering, Identification (information)
Abstract: Dependable complex systems often operate under variable and non-stationary conditions, which requires efficient and extensive monitoring and error detection solutions. Among the many, this paper focuses on anomaly detection techniques, which monitor the evolution of some specific indicators through time to identify anomalies, i.e. deviations from the expected operational behavior. The timely identification of anomalies in dependable, fault tolerant systems makes it possible to detect errors in the services and react appropriately. In this paper, we investigate the possibility of using a random walk model to monitor indicators belonging to Operating Systems, specifically, in our study, Linux Red Hat EL5. The approach is based on the experimental evaluation of a large set of heterogeneous indicators, which are acquired under different operating conditions, both in terms of workload and faultload, on an air traffic management target system. The statistical analysis is based on a best-fitting approach aiming to minimize the integral distance between the empirical data distribution and some reference distributions. The outcomes of the analysis show that the idea of adopting a random walk model for the development of an anomaly detection monitor for critical systems that operates at Operating System level is promising. Moreover, standard distributions such as Laplace and Cauchy, rather than Normal, should be used for setting up the thresholds of the monitor. Further studies that involve a new application and a new layer (an Application Server) will allow verifying the generalization of the approach to other systems, monitored layers and sets of indicators.
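The best-fitting step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the indicator is simulated as a random walk with Laplace-distributed increments, the "integral distance" is taken to be the area between the empirical CDF and each reference CDF over the observed range, distribution parameters are estimated with simple moment and quantile estimators, and all function names (`integral_distance`, `best_fit`, `laplace_threshold`) are hypothetical.

```python
import bisect
import math
import random
import statistics


def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def laplace_cdf(x, loc, scale):
    z = (x - loc) / scale
    return 0.5 * math.exp(z) if z < 0 else 1.0 - 0.5 * math.exp(-z)


def cauchy_cdf(x, loc, scale):
    return 0.5 + math.atan((x - loc) / scale) / math.pi


def integral_distance(sorted_data, cdf, grid_points=2000):
    """Area between the empirical CDF and a reference CDF (midpoint rule).

    Assumption: this is one plausible reading of "integral distance";
    the source abstract does not give the exact definition.
    """
    lo, hi = sorted_data[0], sorted_data[-1]
    step = (hi - lo) / grid_points
    n = len(sorted_data)
    total = 0.0
    for i in range(grid_points):
        x = lo + (i + 0.5) * step
        ecdf = bisect.bisect_right(sorted_data, x) / n
        total += abs(ecdf - cdf(x)) * step
    return total


def best_fit(increments):
    """Return (name of closest reference distribution, all distances)."""
    data = sorted(increments)
    mean = statistics.fmean(data)
    std = statistics.pstdev(data)
    med = statistics.median(data)
    mad = statistics.fmean(abs(x - med) for x in data)  # Laplace scale estimate
    q1, _, q3 = statistics.quantiles(data, n=4)
    half_iqr = (q3 - q1) / 2.0                          # Cauchy scale estimate
    candidates = {
        "normal": lambda x: normal_cdf(x, mean, std),
        "laplace": lambda x: laplace_cdf(x, med, mad),
        "cauchy": lambda x: cauchy_cdf(x, med, half_iqr),
    }
    distances = {k: integral_distance(data, f) for k, f in candidates.items()}
    return min(distances, key=distances.get), distances


def laplace_threshold(scale, alpha):
    """Half-width t such that P(|increment - loc| > t) = alpha under Laplace."""
    return -scale * math.log(alpha)


# Simulated indicator (assumption): random walk x_t = x_{t-1} + e_t,
# with e_t ~ Laplace(0, 1) built as the difference of two exponentials.
rng = random.Random(42)
series = [0.0]
for _ in range(5000):
    series.append(series[-1] + rng.expovariate(1.0) - rng.expovariate(1.0))

# Under a random walk model, the increments are the quantity to be fitted.
increments = [b - a for a, b in zip(series, series[1:])]
best_name, distances = best_fit(increments)
threshold = laplace_threshold(1.0, 0.01)
print(best_name, distances, threshold)
```

In this sketch an increment whose absolute deviation from the fitted location exceeds `threshold` would be flagged as anomalous; the significance level `alpha = 0.01` is an arbitrary illustrative value, not one taken from the paper.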