Authors: Ibrahim Adel Ibrahim, Mostafa Bassiouni
DOI: 10.1186/S13677-019-0139-6
Keywords:
Abstract: Task stragglers in MapReduce jobs dramatically impede job execution in data-intensive computing in cloud data centers. This impedance is due to the uneven distribution of input data, heterogeneous nodes, resource contention situations, and network configurations. Skew in the intermediate data causes delay failures due to the violation of the job completion time. Data-intensive computing frameworks, such as MapReduce or Hadoop YARN, employ HashPartitioner. This partitioner may cause intermediate data skew, which results in straggler reducers. In this paper, we strive to make Hadoop YARN more efficient in cloud computing environments. We present a new partitioning scheme, called balanced data clusters partitioner (BDCP), to handle Reduce tasks based on sampling of the intermediate data and on feedback information about the current processing task. Our extensive experimental results show that BDCP can outperform the default HashPartitioner and the Range partitioner. BDCP can assist in straggler mitigation during the reduce phase and minimize the job completion time within data-intensive cloud computing.
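The skew problem the abstract describes can be illustrated with a small sketch. It is not the BDCP algorithm, whose details are not given here. It only contrasts hash-based partitioning with a generic greedy assignment driven by key frequencies. The function names, the toy key distribution, and the use of `zlib.crc32` as a stand-in for Java's `hashCode` are all assumptions made for illustration, and exact counts stand in for the sampled frequencies a real partitioner would estimate.

```python
import zlib
from collections import Counter


def hash_partition(key: str, num_reducers: int) -> int:
    """Hash-style partitioning: reducer index = hash(key) mod number of reducers.

    zlib.crc32 is a deterministic stand-in (assumption) for Java's hashCode.
    """
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def reducer_loads(records, assign, num_reducers):
    """Count how many records each reducer receives under an assignment function."""
    loads = [0] * num_reducers
    for key in records:
        loads[assign(key)] += 1
    return loads


def greedy_balanced_assignment(key_counts, num_reducers):
    """Hypothetical balancing heuristic: place the heaviest keys first,
    each on the currently least-loaded reducer."""
    loads = [0] * num_reducers
    assignment = {}
    # Sort by descending count, then by key, so the result is deterministic.
    for key, count in sorted(key_counts.items(), key=lambda kv: (-kv[1], kv[0])):
        target = loads.index(min(loads))
        assignment[key] = target
        loads[target] += count
    return assignment


# Toy skewed intermediate data: key "a" dominates.
records = ["a"] * 50 + ["b"] * 30 + ["c"] * 10 + ["d"] * 5 + ["e"] * 5
num_reducers = 3

hash_loads = reducer_loads(
    records, lambda k: hash_partition(k, num_reducers), num_reducers
)

# Exact counts here; a real system would estimate them from a sample.
key_counts = Counter(records)
assignment = greedy_balanced_assignment(key_counts, num_reducers)
balanced_loads = reducer_loads(records, lambda k: assignment[k], num_reducers)

print("hash loads:    ", hash_loads)
print("balanced loads:", balanced_loads)
```

Because all records of one key must reach the same reducer, the heaviest key bounds the best achievable maximum load. The greedy assignment reaches that bound on this toy input, while hash partitioning ignores key frequencies and may stack several heavy keys on one reducer.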