A novel technique for long-term anomaly detection in the cloud

Authors: Jordan Hochenbaum, Owen Vallis, Arun Kejariwal

Keywords: Cloud computing, Term (time), Data mining, High availability, Time series, Piecewise, Computer science, Web service, Anomaly detection

Abstract: High availability and performance of a web service is key, amongst other factors, to the overall user experience (which in turn directly impacts the bottom line). Exogenic and/or endogenic factors often give rise to anomalies that make maintaining high availability and delivering high performance very challenging. Although there exists a large body of prior research in anomaly detection, existing techniques are not suitable for detecting long-term anomalies owing to a predominant underlying trend component in the time series data. To this end, we developed a novel statistical technique to automatically detect long-term anomalies in cloud data. Specifically, the technique employs statistical learning to detect anomalies in both application as well as system metrics. Further, the technique uses robust statistical metrics, viz., median and median absolute deviation (MAD), and piecewise approximation of the underlying long-term trend to accurately detect anomalies even in the presence of intra-day and/or weekly seasonality. We demonstrate the efficacy of the proposed technique using production data and report Precision, Recall, and F-measure. Multiple teams at Twitter are currently using the proposed technique on a daily basis.
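To make the role of the robust statistics concrete, here is a minimal Python sketch of the general idea and not the authors' implementation. It assumes a piecewise-constant trend (the median of each non-overlapping window), scales the residuals by the MAD, and flags points whose robust score exceeds a cutoff. The function names, the window size, the 3.5 cutoff, and the 1.4826 normal-consistency constant are illustrative assumptions that do not come from the abstract, and the seasonality handling the abstract mentions is omitted. A small helper also shows how Precision, Recall, and F-measure could be computed against labeled anomalies.

```python
from statistics import median


def mad(values):
    """Median absolute deviation of a list of numbers."""
    m = median(values)
    return median([abs(v - m) for v in values])


def piecewise_median_anomalies(series, window, threshold=3.5):
    """Return indices of points far from their window's median.

    Assumption: the median of each non-overlapping window stands in for
    the local trend (a piecewise-constant approximation). Residuals are
    scaled by the MAD of all residuals, so a few extreme points do not
    inflate the scale the way a standard deviation would.
    """
    residuals = []
    for start in range(0, len(series), window):
        chunk = series[start:start + window]
        level = median(chunk)  # piecewise trend estimate for this window
        residuals.extend(v - level for v in chunk)

    scale = 1.4826 * mad(residuals)  # assumed normal-consistency constant
    if scale == 0:
        return []  # no spread at all, so nothing can be scored
    center = median(residuals)
    return [i for i, r in enumerate(residuals)
            if abs(r - center) / scale > threshold]


def precision_recall_f1(predicted, actual):
    """Precision, recall, and F-measure for two sets of anomaly indices."""
    predicted, actual = set(predicted), set(actual)
    tp = len(predicted & actual)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(actual) if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical data: the level shifts from about 10 to about 20, with one spike.
series = [10, 11, 9, 10, 12, 10, 9, 11, 20, 21, 19, 60, 22, 20, 19, 21]
flagged = piecewise_median_anomalies(series, window=8)
```

Because each window is centered on its own median, the shift from roughly 10 to roughly 20 is absorbed by the trend estimate and only the spike stands out. A single global median would instead leave every point in the second window with a residual of about 10.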

References (10)
R. B. Cleveland, STL: A Seasonal-Trend Decomposition Procedure Based on Loess. Journal of Official Statistics, vol. 6, pp. 3-73 (1990)
Bernard Rosner, Percentage Points for a Generalized ESD Many-Outlier Procedure. Technometrics, vol. 25, pp. 165-172 (1983), 10.1080/00401706.1983.10487848
Frank R. Hampel, The Influence Curve and Its Role in Robust Estimation. Journal of the American Statistical Association, vol. 69, pp. 383-393 (1974), 10.1080/01621459.1974.10482962
Bernard Rosner, On the Detection of Many Outliers. Technometrics, vol. 17, pp. 221-227 (1975), 10.1080/00401706.1975.10489305
Elvezio M. Ronchetti, Peter J. Rousseeuw, Werner A. Stahel, Frank R. Hampel, Robust Statistics: The Approach Based on Influence Functions (1986)
Arun Kejariwal, Winston Lee, Owen Vallis, Jordan Hochenbaum, Bryce Yan, Visual Analytics Framework for Cloud Infrastructure Data. Computational Science and Engineering, pp. 886-893 (2013), 10.1109/CSE.2013.133
Winston Lee, Arun Kejariwal, Bryce Yan, Chiffchaff: Observability and Analytics to Achieve High Availability. 2013 IEEE Symposium on Large-Scale Data Analysis and Visualization (LDAV), pp. 119-120 (2013), 10.1109/LDAV.2013.6675168
William S. Cleveland, Robust Locally Weighted Regression and Smoothing Scatterplots. Journal of the American Statistical Association, vol. 74, pp. 829-836 (1979), 10.1080/01621459.1979.10481038
A. Wald, Maurice G. Kendall, The Advanced Theory of Statistics. Journal of the American Statistical Association, vol. 42, pp. 185- (1947), 10.2307/2280203