Distributed Anomaly Detection and Prevention for Virtual Platforms

Author: Ali Imran Jehangiri

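Abstract: An increasing number of applications are being hosted on cloud-based platforms. Cloud platforms serve as a general computing facility, and the applications hosted on them range from simple multi-tier web applications to complex social networking, eCommerce, and Big Data applications. High availability, performance, and auto-scaling are key requirements of these applications, and cloud platforms serve them using dynamic provisioning of resources in an on-demand, multi-tenant fashion. A key challenge for cloud service providers is to ensure the Quality of Service (QoS), as the user or customer requires more explicit guarantees of the QoS of the provided services. Performance problems can directly lead to extensive financial losses. Thus, control and verification of QoS become a vital concern for any production-level deployment, and it is crucial to address performance as a managed objective. The success of cloud services depends critically on automated problem diagnostics and predictive analytics, enabling organizations to manage their performance proactively. Moreover, effective and advanced monitoring is equally important for performance management support in clouds.

In this thesis, we explore key techniques for developing monitoring and performance management systems to achieve robust cloud systems. At first, two case studies are presented as motivation for the need for a scalable monitoring and analytics framework. The first is a case study of the performance issues of a software service hosted on a virtualized platform. In the second case study, we analyzed cloud services offered by a large IT service provider. A generalization of the case studies forms the basis for the requirement specifications, which are used for the state-of-the-art analysis. Although some solutions for particular challenges have already been provided, a scalable approach for performance problem diagnosis and prediction is still missing.

To address this issue, a distributed, scalable monitoring and analytics framework is presented in the first part of the thesis. We conducted a thorough analysis of the technologies to be used by our framework, which makes use of existing monitoring and analytics technologies. However, we develop custom collectors to retrieve data non-intrusively from different layers of the cloud. In addition, we develop subscriber and publisher components that retrieve service-related events from different APIs and send alerts to the SLA Management component for taking corrective measures. Further, we implemented an Open Cloud Computing Interface (OCCI) monitoring extension using the OCCI Mixin mechanism.

To deal with performance problem diagnosis, a novel distributed parallel approach for performance anomaly detection is presented. First, all anomalous metrics are found in a metric database by statistical analysis over a time-series window. For this step, we present a comparative analysis of three selected light-weight statistical anomaly detection techniques. We extend this work using the MapReduce paradigm and assess and compare the methods in terms of precision, recall, execution time, speedup, and scale-up. Next, we correlate the anomalous metrics with the target SLO in order to locate the suspicious metrics. We evaluated our approach on a production cloud encompassing Infrastructure as a Service (IaaS) and Platform as a Service (PaaS) service models. Experimental results confirm that our approach is efficient and effective in capturing the metrics causing performance anomalies.

Finally, we present the design and implementation of an online anomaly prediction system for cloud computing infrastructures. We further present an experimental evaluation of a set of anomaly prediction methods that aim at predicting upcoming periods of high utilization or poor performance with enough lead time to enable the appropriate scheduling, scaling, and migration of virtual resources. Using real data sets gathered from the cloud platforms of a university data center, we compare several approaches ranging from time-series models (e.g. auto regression (AR)) to statistical classification methods (e.g. Bayesian classifier). We observe that linear time-series models, especially AR models, are most likely suitable to model QoS measures and forecast their future values. Moreover, linear time-series models can be integrated with Machine Learning (ML) methods to improve proactive QoS management.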
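
The two-step diagnosis outlined in the abstract (flag anomalous metrics within a time-series window, then correlate them with the target SLO metric) can be illustrated with a minimal sketch. This is not the thesis implementation: the abstract does not name the three statistical techniques that were compared, so a plain z-score test stands in for them here, and the function names, thresholds, and sample data are all hypothetical. The MapReduce parallelization is not shown.

```python
from statistics import mean, pstdev

def anomalous_metrics(series_by_metric, window, threshold=3.0):
    """Flag metrics whose latest sample lies more than `threshold` standard
    deviations from the mean of the preceding `window` samples.
    Assumes every series holds more than `window` samples."""
    flagged = []
    for name, values in series_by_metric.items():
        history, latest = values[-window - 1:-1], values[-1]
        sigma = pstdev(history)
        # A constant history has no spread to measure against; skip it.
        if sigma > 0 and abs(latest - mean(history)) / sigma > threshold:
            flagged.append(name)
    return flagged

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0

def suspicious_metrics(series_by_metric, slo_series, window,
                       threshold=3.0, min_corr=0.8):
    """Keep only the anomalous metrics that correlate strongly with the
    SLO metric, ranked by absolute correlation (strongest first)."""
    ranked = [(name, pearson(series_by_metric[name], slo_series))
              for name in anomalous_metrics(series_by_metric, window, threshold)]
    return sorted((p for p in ranked if abs(p[1]) >= min_corr),
                  key=lambda p: -abs(p[1]))

# Hypothetical sample data: CPU utilization spikes together with the
# response time (the SLO metric), while disk usage stays flat.
metrics = {
    "cpu": [20, 22, 21, 19, 20, 21, 20, 95],
    "disk": [5, 5, 6, 5, 5, 6, 5, 5],
}
response_ms = [100, 105, 102, 98, 100, 103, 101, 400]

result = suspicious_metrics(metrics, response_ms, window=7)
print(result)
```

In this sample only `cpu` is reported, because `disk` never leaves its normal range. The 3-sigma threshold and the 0.8 correlation cut-off are illustrative values, not figures taken from the thesis.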
