This paper studies job scheduling algorithms for the Hadoop platform and proposes a multi-queue scheduling optimization algorithm that distinguishes between job types. First, the algorithm assigns jobs of different types to a node according to the node's current load, which improves the node's resource utilization. Second, it allows the idle resources of one job queue to be occupied by other job queues and reclaims them immediately when the original queue needs them; reclamation supports task preemption. Third, it logically divides the job queues into a shared queue list and a non-shared queue list to prevent the ping-pong effect. Performance tests on a Hadoop platform show that, compared with the system's default scheduling algorithm, the proposed algorithm significantly improves both the execution efficiency and the execution stability of job scheduling.
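The three mechanisms summarized above can be illustrated with a minimal sketch. It is not the paper's implementation: the names `Queue`, `Scheduler`, `borrow`, `reclaim`, and `pick_job_type` are hypothetical, and slots are counted abstractly rather than through any Hadoop API. Two rules in it are assumptions made only for illustration: a node receives the job type that stresses its less-loaded resource, and a queue that reclaims its slots is moved to the non-shared list so that it does not immediately lend them out again.

```python
from dataclasses import dataclass, field


@dataclass
class Queue:
    name: str
    capacity: int                                   # slots guaranteed to this queue
    running: int = 0                                # slots used by its own tasks
    lent: dict = field(default_factory=dict)        # borrower name -> slots on loan

    def idle(self) -> int:
        return self.capacity - self.running - sum(self.lent.values())


class Scheduler:
    def __init__(self, queues):
        self.queues = {q.name: q for q in queues}
        # Logical split: only queues on the shared list may lend slots.
        self.shared = {q.name for q in queues}
        self.non_shared = set()

    def borrow(self, borrower: str, wanted: int) -> int:
        """Lend idle slots of shared queues to `borrower`; return slots granted."""
        granted = 0
        for name in sorted(self.shared):
            if name == borrower or granted == wanted:
                continue
            lender = self.queues[name]
            take = min(lender.idle(), wanted - granted)
            if take > 0:
                lender.lent[borrower] = lender.lent.get(borrower, 0) + take
                granted += take
        return granted

    def reclaim(self, owner: str, needed: int) -> int:
        """Preempt lent slots so `owner` can start `needed` tasks; return slots preempted."""
        q = self.queues[owner]
        preempted = 0
        for borrower in list(q.lent):
            short = needed - q.idle()
            if short <= 0:
                break
            take = min(q.lent[borrower], short)
            q.lent[borrower] -= take
            if q.lent[borrower] == 0:
                del q.lent[borrower]
            preempted += take
        q.running += min(needed, q.idle())
        # Assumed anti-ping-pong rule: a queue that had to reclaim stops lending.
        self.shared.discard(owner)
        self.non_shared.add(owner)
        return preempted


def pick_job_type(cpu_load: float, io_load: float) -> str:
    """Assumed rule: choose the job type that uses the node's less-loaded resource."""
    return "cpu_bound" if cpu_load <= io_load else "io_bound"
```

In this sketch, a saturated queue calls `borrow` to run extra tasks in another queue's idle slots. When the lender later calls `reclaim`, only as many lent slots are preempted as its own idle slots cannot cover, and the lender then leaves the shared list. A real scheduler would also need a policy for returning queues to the shared list, which the abstract does not describe.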