Building a fault tolerant framework with deadline guarantee in big data stream computing environments

Authors: Dawei Sun, Guangyan Zhang, Chengwen Wu, Keqin Li, Weimin Zheng

DOI: 10.1016/J.JCSS.2016.10.010

Keywords: Data stream clustering, Critical path method, Scheduling (computing), Stream, Distributed computing, Data stream mining, Data stream, Fault tolerance, Graph (abstract data type), Computer science

Abstract: Big data stream computing systems should work continuously to process streams of on-line data. Therefore, fault tolerance is one of the key metrics of quality of service in big data stream computing. In this paper, we propose a fault tolerant framework with deadline guarantee for big data stream computing, called FTDG. First, FTDG identifies the critical path of a data stream graph at a given data stream throughput, and quantifies the system reliability of the data stream graph. Second, FTDG allocates tasks by a fault-tolerance-aware heuristic scheduling mechanism. Third, FTDG optimizes task scheduling online by reallocating vertices on the critical path to lower response time and reduce fluctuations. Theoretical as well as experimental results demonstrate that FTDG makes a desirable trade-off between high fault tolerance and low response time objectives in big data stream computing environments.
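The first step in the abstract, identifying the critical path of a data stream graph, can be illustrated with a generic longest-path computation on a directed acyclic graph. The sketch below is not FTDG's algorithm, which this record does not describe. It assumes each vertex carries a fixed processing latency and ignores communication cost and throughput; the function name `critical_path`, the `latency` mapping, and the example vertex names are all hypothetical.

```python
from collections import defaultdict, deque


def critical_path(latency, edges):
    """Return (total_latency, path) for the longest path in a DAG.

    latency: dict mapping each vertex to its processing latency (assumed fixed).
    edges:   iterable of (upstream, downstream) vertex pairs.
    """
    succ = defaultdict(list)
    indeg = {v: 0 for v in latency}
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1

    # Topological order via Kahn's algorithm.
    order = []
    queue = deque(v for v in latency if indeg[v] == 0)
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if len(order) != len(latency):
        raise ValueError("graph contains a cycle")

    # best[v] is the largest accumulated latency of any path ending at v.
    best = dict(latency)
    prev = {v: None for v in latency}
    for u in order:
        for v in succ[u]:
            candidate = best[u] + latency[v]
            if candidate > best[v]:
                best[v] = candidate
                prev[v] = u

    # Walk predecessors back from the vertex with the largest total.
    node = max(best, key=best.get)
    total = best[node]
    path = []
    while node is not None:
        path.append(node)
        node = prev[node]
    path.reverse()
    return total, path


# Hypothetical five-vertex stream graph with made-up latencies.
latency = {"source": 1.0, "parse": 2.0, "filter": 0.5, "join": 3.0, "sink": 1.0}
edges = [("source", "parse"), ("source", "filter"),
         ("parse", "join"), ("filter", "join"), ("join", "sink")]
total, path = critical_path(latency, edges)
```

In this toy graph the vertices on the returned path are the ones whose placement bounds end-to-end response time, which is why a scheduler with a deadline target would look at them first.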
