A Data Placement Strategy for Big Data Based on DCC in Cloud Computing Systems

Authors: Tao Wang, Shihong Yao, Zhengquan Xu, Shan Jia, Qiang Xu

DOI: 10.1109/SMARTCITY.2015.139

Keywords: Cloud computing, Computational complexity theory, Big data, Distributed data store, Dynamic priority scheduling, Computer science, Distributed database, Distributed computing, Fair-share scheduling, Data center, Data mining

Abstract: In complex and data-intensive applications, data scheduling between data centers must occur when multiple datasets stored in distributed data centers are processed by one computation. To store massive data effectively and reduce data scheduling during the execution of computations, a mathematical model of data scheduling in cloud computing is built and dynamic computation correlation (DCC) is defined. A data placement strategy for big data based on DCC is then proposed. Datasets with high DCC are placed into the same data center, and new datasets are dynamically placed into the most appropriate data center. Comprehensive experiments show that the proposed strategy can reduce the number of data scheduling operations and has a considerably low, almost constant computational complexity as the number of datasets becomes massive. The proposed strategy is expected to be applicable to practical large-scale distributed storage systems for big data management.
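
The placement idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the DCC formula, so the sketch assumes a simple stand-in (the number of computations that use both datasets of a pair), a fixed per-center capacity, and a greedy rule that places each new dataset in the center whose resident datasets correlate with it most. All function names and the sample data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def correlation_counts(computations):
    """Count, for each dataset pair, how many computations use both.

    ASSUMPTION: this co-usage count stands in for DCC; the paper's
    actual definition is not stated in the abstract.
    """
    counts = defaultdict(int)
    for datasets in computations:
        for a, b in combinations(sorted(set(datasets)), 2):
            counts[(a, b)] += 1
    return counts


def pair_score(counts, a, b):
    """Look up the correlation of an unordered dataset pair."""
    key = (a, b) if a < b else (b, a)
    return counts.get(key, 0)


def place_dataset(new, centers, counts, capacity):
    """Greedily place `new` in the non-full center with the highest
    summed correlation to its resident datasets (hypothetical rule)."""
    best, best_score = None, -1
    for name in sorted(centers):  # sorted for deterministic tie-breaking
        residents = centers[name]
        if len(residents) >= capacity:
            continue
        score = sum(pair_score(counts, new, d) for d in residents)
        if score > best_score:
            best, best_score = name, score
    if best is None:
        raise ValueError("no data center has free capacity")
    centers[best].append(new)
    return best


def cross_center_transfers(computations, centers):
    """Count extra centers each computation touches, used here as a
    rough proxy for the amount of data scheduling."""
    location = {d: name for name, ds in centers.items() for d in ds}
    return sum(len({location[d] for d in comp}) - 1 for comp in computations)


# Hypothetical workload: each inner list is one computation's input datasets.
computations = [["d1", "d2"], ["d1", "d2", "d3"], ["d3", "d4"]]
centers = {"dc1": ["d1"], "dc2": ["d4"]}
counts = correlation_counts(computations)
for dataset in ["d2", "d3"]:
    place_dataset(dataset, centers, counts, capacity=3)
print(centers, cross_center_transfers(computations, centers))
```

In this toy workload, datasets that are frequently used together end up in the same center, so fewer computations need data from more than one center. A real system would also have to account for dataset sizes, bandwidth, and correlations that change over time, none of which the sketch models.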
