Reliable State Monitoring in Cloud Datacenters

Authors: Shicong Meng, Arun K. Iyengar, Isabelle M. Rouvellou, Ling Liu, Kisung Lee

DOI: 10.1109/CLOUD.2012.10

Abstract: State monitoring is widely used for detecting critical events and abnormalities of distributed systems. As the scale of such systems grows and the degree of workload consolidation increases in Cloud datacenters, node failures and performance interferences, especially transient ones, become the norm rather than the exception. Hence, distributed state monitoring tasks are often exposed to impaired communication caused by such dynamics on different nodes. Unfortunately, existing distributed state monitoring approaches are often designed under the assumption of always-online distributed monitoring nodes and reliable inter-node communication. As a result, these approaches often produce misleading results, which in turn introduce various problems to Cloud users who rely on state monitoring results to perform automatic management tasks such as auto-scaling. This paper introduces a new state monitoring approach that tackles this challenge by exposing and handling communication dynamics such as message delay and message loss in Cloud monitoring environments. Our approach delivers two distinct features. First, it quantitatively estimates the accuracy of monitoring results to capture uncertainties introduced by messaging dynamics. This feature helps users distinguish trustworthy monitoring results from ones heavily deviated from the truth, yet significantly improves monitoring utility compared with simple techniques that invalidate all monitoring results generated in the presence of messaging dynamics. Second, our approach also adapts to non-transient messaging issues by reconfiguring distributed monitoring algorithms to minimize monitoring errors. Our experimental results show that, even under severe message loss and delay, our approach consistently improves monitoring accuracy, and when applied to Cloud application auto-scaling, outperforms existing state monitoring techniques in terms of the ability to correctly trigger dynamic provisioning.
