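To address the uneven load distribution that arises at run time under the task-level scheduling and allocation methods built into current Hadoop clusters, this paper analyzes the execution capability of cluster nodes and proposes an adaptive task scheduling and allocation method based on node capability. Using each node's historical and current load status, the method takes node performance, task characteristics, node failure rate, and related factors as the basis for allocating task volume to nodes, and lets each node adaptively adjust the number of tasks it runs. Experimental results show that the total task completion time of the cluster is markedly reduced, the load across nodes is more balanced, and node resources are used more reasonably.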
Scheduling algorithms play an important role in improving Hadoop cluster performance. However, the task-level scheduling methods built into current Hadoop distribute load unevenly. To keep the load on each node in the cluster balanced, we analyze the execution capability of cluster nodes and propose an adaptive task scheduling method based on node capability. Drawing on each node's history and current load status, the method takes node performance, task characteristics, and node failure rate as parameters to calculate the node's execution capability. On this basis, different numbers of tasks are assigned to cluster nodes, and each participating node adaptively adjusts the number of tasks it runs so that it is better suited to different tasks. Experimental results indicate that the proposed method significantly reduces total task completion time, balances the load across nodes more evenly, and makes node resource utilization more reasonable.
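The abstract does not give the formula used to compute node execution capability, so the following Python sketch is only an illustration of the general idea, not the paper's method. Everything in it is an assumption made for the example: the `NodeStatus` fields, the `capability` scoring rule (performance weighted by the task's CPU/IO mix, scaled by free capacity and reliability), the blending weight `alpha`, and the proportional `allocate_slots` routine.

```python
from dataclasses import dataclass


@dataclass
class NodeStatus:
    """Hypothetical per-node snapshot; field names are illustrative only."""
    name: str
    cpu_score: float        # relative CPU performance, > 0
    io_score: float         # relative I/O performance, > 0
    historical_load: float  # average past load in [0, 1]
    current_load: float     # present load in [0, 1]
    failure_rate: float     # fraction of past tasks that failed, in [0, 1]


def capability(node: NodeStatus, cpu_fraction: float, alpha: float = 0.5) -> float:
    """Assumed scoring rule, not taken from the paper.

    cpu_fraction in [0, 1] stands in for task characteristics: 1.0 means a
    purely CPU-bound task, 0.0 a purely I/O-bound one. alpha blends
    historical and current load into one load estimate.
    """
    performance = cpu_fraction * node.cpu_score + (1.0 - cpu_fraction) * node.io_score
    load = alpha * node.historical_load + (1.0 - alpha) * node.current_load
    return performance * (1.0 - load) * (1.0 - node.failure_rate)


def allocate_slots(nodes, total_slots: int, cpu_fraction: float) -> dict:
    """Split total_slots across nodes in proportion to their capability scores."""
    scores = {n.name: capability(n, cpu_fraction) for n in nodes}
    total = sum(scores.values())
    if total <= 0:
        return {name: 0 for name in scores}
    # Floor each proportional share, then give the leftover slots to the
    # nodes with the largest fractional remainders.
    shares = {name: total_slots * s / total for name, s in scores.items()}
    slots = {name: int(share) for name, share in shares.items()}
    leftover = total_slots - sum(slots.values())
    by_remainder = sorted(shares, key=lambda name: shares[name] - slots[name], reverse=True)
    for name in by_remainder[:leftover]:
        slots[name] += 1
    return slots


if __name__ == "__main__":
    cluster = [
        NodeStatus("A", cpu_score=2.0, io_score=1.0, historical_load=0.2, current_load=0.2, failure_rate=0.0),
        NodeStatus("B", cpu_score=1.0, io_score=1.0, historical_load=0.2, current_load=0.2, failure_rate=0.0),
    ]
    # For a CPU-bound workload, the faster node A receives the larger share.
    print(allocate_slots(cluster, total_slots=6, cpu_fraction=1.0))
```

Re-running `allocate_slots` whenever load or failure statistics are refreshed is one simple way to mimic the adaptive adjustment the abstract describes; a real Hadoop scheduler would instead act on heartbeat information from each node.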