Workload Characteristic Oriented Scheduler for MapReduce

Authors: Peng Lu, Young Choon Lee, Chen Wang, Bing Bing Zhou, Junliang Chen

DOI: 10.1109/ICPADS.2012.31

Keywords: Fixed-priority pre-emptive scheduling; Dynamic priority scheduling; Two-level scheduling; Workload; Fair-share scheduling; Computer science; Distributed computing; Scheduling (computing)

Abstract: Applications in many areas are increasingly developed and ported using the MapReduce framework (more specifically, Hadoop) to exploit (data) parallelism. The application scope of MapReduce has been extended beyond its original design goal, which was large-scale data processing. This extension inherently creates a need for the scheduler to explicitly take into account the characteristics of each job, with two main goals: efficient resource use and performance improvement. In this paper, we study scheduling strategies that effectively deal with different workload characteristics, namely CPU intensive and I/O intensive. We present the Workload Characteristic Oriented Scheduler (WCO), which strives to co-locate tasks of possibly different jobs with complementing resource usage characteristics. WCO is characterized by its essentially dynamic and adaptive scheduling decisions, which use information obtained from its characteristic estimator. Workload characteristics are primarily estimated by sampling, with the help of some static task selection strategies, e.g., Java bytecode analysis. Results from extensive experiments using 11 benchmarks on a 4-node local cluster and a 51-node Amazon EC2 cluster show a 17% improvement on average in terms of throughput in the situation of co-existing diverse workloads.
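The co-location idea in the abstract can be illustrated with a minimal sketch. This is not the WCO algorithm, whose details are not given here; it only assumes that each task has already been labelled CPU intensive or I/O intensive (for example by a sampling step) and that a node prefers the kind it is currently running fewer of. The names `pick_task`, `CPU`, and `IO` are hypothetical.

```python
from collections import Counter, deque

# Hypothetical labels for the two workload kinds named in the abstract.
CPU, IO = "cpu", "io"


def pick_task(running_kinds, pending):
    """Pick a pending task whose kind complements the node's current load.

    running_kinds: kinds (CPU or IO) of the tasks already running on the node.
    pending: deque of (task_id, kind) in arrival order; the chosen task is removed.
    Returns the chosen (task_id, kind), or None if nothing is pending.
    """
    if not pending:
        return None
    counts = Counter(running_kinds)
    # Assumption: prefer the kind that is less represented on this node;
    # on a tie, prefer CPU.
    wanted = IO if counts[CPU] > counts[IO] else CPU
    for i, (task_id, kind) in enumerate(pending):
        if kind == wanted:
            del pending[i]
            return (task_id, kind)
    # No complementary task is waiting: fall back to arrival order.
    return pending.popleft()


queue = deque([("t1", CPU), ("t2", IO), ("t3", CPU)])
print(pick_task([CPU, CPU], queue))  # node is CPU heavy, so an I/O task is preferred
```

A real scheduler would estimate the labels at run time and revise them as tasks progress; the fixed labels here only keep the sketch short.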
