Heuristic attribute reduction and resource-saving algorithm for energy data of data centers

Authors: Mincheng Chen, Jingling Yuan, Lin Li, Dongling Liu, Yang He

DOI: 10.1007/S10115-018-1288-5

Keywords: Data classification, Energy management, Missing data, Heuristic (computer science), Data center, Data pre-processing, Algorithm, Computer science, Spark (mathematics), Energy consumption

Abstract: Energy data, which consist of energy consumption statistics and other related data in green data centers, grow dramatically. The energy data have great value, but many attributes within them are redundant and unnecessary, and they have a serious impact on the performance of the data center's decision-making system. Thus, attribute reduction for the energy data has been conceived as a critical step. However, many existing attribute reduction algorithms are often computationally time-consuming. To address these issues, firstly, we extend the methodology of rough sets to construct a data center energy knowledge representation system. Energy data are subject to some degree of exceptions caused by power failure, energy instability or other factors; hence, we design an integrated data preprocessing method using Spark, which mainly includes sampling analysis, data classification, missing data filling, outlier data prediction and data discretization. By taking good advantage of in-memory computing, a fast heuristic attribute reduction algorithm for energy data using Spark (FHARA-S) is proposed. In this algorithm, we use an efficient algorithm for transforming the energy consumption decision table and a heuristic formula for measuring the significance of attributes to reduce the search space, and we introduce the correlation between condition attributes and the decision attribute, which further improves computational efficiency. We also design an adaptive decision-making resource management architecture for the green data center based on FHARA-S, which can improve decision-making efficiency and strengthen energy management. The experimental results show that the speed of our algorithm gains up to a 2.2X performance improvement over the traditional attribute reduction algorithm using MapReduce and a 0.61X performance improvement over the algorithm using Spark. Besides, our algorithm saves more resources.
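As a rough illustration of the kind of rough-set heuristic attribute reduction the abstract refers to, the sketch below greedily adds the condition attribute that most increases the dependency degree (the fraction of rows whose decision value is fully determined by the chosen attributes). This is a generic textbook-style forward selection on a single machine, not the FHARA-S algorithm itself: the function names `dependency` and `greedy_reduct`, the toy table `SAMPLE`, and its attribute names are all hypothetical, and the paper's own significance formula, correlation measure and Spark parallelization are not reproduced here.

```python
from collections import defaultdict


def dependency(rows, attrs, decision):
    """Dependency degree: share of rows lying in the positive region of `attrs`.

    Rows are grouped into equivalence classes by their values on `attrs`;
    a class counts as consistent when all of its rows share one decision value.
    """
    if not rows:
        return 0.0
    decisions = defaultdict(set)  # class key -> decision values seen
    sizes = defaultdict(int)      # class key -> number of rows
    for row in rows:
        key = tuple(row[a] for a in attrs)
        decisions[key].add(row[decision])
        sizes[key] += 1
    consistent = sum(sizes[k] for k, d in decisions.items() if len(d) == 1)
    return consistent / len(rows)


def greedy_reduct(rows, condition_attrs, decision):
    """Forward selection: add the attribute with the largest dependency gain
    until the subset is as informative as the full condition attribute set."""
    target = dependency(rows, condition_attrs, decision)
    reduct = []
    current = dependency(rows, reduct, decision)
    while current < target:
        remaining = [a for a in condition_attrs if a not in reduct]
        if not remaining:
            break
        # Gain is never negative, so starting at -1.0 always selects an attribute.
        best_attr, best_gain = None, -1.0
        for a in remaining:
            gain = dependency(rows, reduct + [a], decision) - current
            if gain > best_gain:
                best_attr, best_gain = a, gain
        reduct.append(best_attr)
        current = dependency(rows, reduct, decision)
    return reduct


# Hypothetical, already-discretized decision table (illustrative values only).
SAMPLE = [
    {"cpu": "high", "temp": "hot", "fan": "on", "energy": "high"},
    {"cpu": "high", "temp": "cool", "fan": "on", "energy": "high"},
    {"cpu": "low", "temp": "hot", "fan": "on", "energy": "low"},
    {"cpu": "low", "temp": "cool", "fan": "on", "energy": "low"},
]

if __name__ == "__main__":
    print(greedy_reduct(SAMPLE, ["cpu", "temp", "fan"], "energy"))  # prints ['cpu']
```

In this toy table the `cpu` attribute alone determines the decision, so the other two attributes are dropped as redundant. A greedy search like this does not guarantee a minimal reduct in general; it only trades exhaustive subset search for a much smaller search space.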
