To address task scheduling in grid environments, where the target system is large-scale, distributed, heterogeneous, and dynamic, a heterogeneous task scheduling algorithm based on fuzzy clustering is proposed. Many previous scheduling algorithms traverse the entire target system at every scheduling step to choose a suitable processing unit for a task. Although such methods can achieve a small makespan, they inevitably increase the runtime of the scheduling process itself. In this paper, a group of features describing the overall performance of the processing units is defined, and the target system (the processing unit network) is preprocessed with a fuzzy clustering method to partition it reasonably. In the scheduling stage, the cluster with better overall performance is chosen first, so the search space shrinks and there is no need to examine every processing unit at every step; this greatly reduces the time spent selecting a processing unit for the current task. In addition, the priority of a ready task implicitly accounts for the influence of the execution of critical-path nodes on the whole program, and also for the influence of the heterogeneous resources on which the task will be executed. Experiments and comparative performance analysis show that the defined processor features yield a reasonable partition of the processor network, and that the advantage of the proposed algorithm becomes more pronounced as the scale of the target system grows.
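The cluster-then-schedule idea described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not name the specific fuzzy clustering method, the processor features, or the priority formula, so the sketch assumes plain fuzzy c-means as a stand-in, two hypothetical normalized features per processing unit (speed and bandwidth), a cluster score equal to the sum of the center's features, and tasks that arrive already sorted by priority. All function and variable names are illustrative.

```python
import math

def memberships(points, centers, m=2.0):
    """Fuzzy c-means membership matrix: one row per point, one column per center."""
    rows = []
    for p in points:
        # Clamp distances so a point sitting exactly on a center does not divide by zero.
        d = [max(math.dist(p, c), 1e-12) for c in centers]
        rows.append([1.0 / sum((di / dk) ** (2.0 / (m - 1.0)) for dk in d) for di in d])
    return rows

def fuzzy_c_means(points, c=2, m=2.0, iters=50):
    """Plain fuzzy c-means (requires c >= 2) with a deterministic start."""
    ordered = sorted(points, key=sum)
    # Initial centers: evenly spaced picks from the points sorted by feature sum.
    centers = [list(ordered[i * (len(ordered) - 1) // (c - 1)]) for i in range(c)]
    dims = range(len(points[0]))
    for _ in range(iters):
        u = memberships(points, centers, m)
        for i in range(c):
            w = [row[i] ** m for row in u]
            centers[i] = [sum(wj * p[k] for wj, p in zip(w, points)) / sum(w) for k in dims]
    return centers, memberships(points, centers, m)

def best_cluster(points, c=2):
    """Indices of the units whose strongest membership is in the best-scoring cluster."""
    centers, u = fuzzy_c_means(points, c)
    top = max(range(c), key=lambda i: sum(centers[i]))  # assumed score: sum of features
    members = [j for j, row in enumerate(u) if max(range(c), key=row.__getitem__) == top]
    return members or list(range(len(points)))  # fall back to all units if empty

def schedule(task_work, points, candidates):
    """Assign tasks (already in priority order) to the candidate with the earliest finish."""
    free_at = {j: 0.0 for j in candidates}
    plan = []
    for t, work in enumerate(task_work):
        # Finish time = time the unit becomes free + work / speed (feature 0, assumed).
        j = min(candidates, key=lambda j: free_at[j] + work / points[j][0])
        free_at[j] += work / points[j][0]
        plan.append((t, j))
    return plan

# Hypothetical (speed, bandwidth) features, normalized to [0, 1].
units = [(0.9, 0.8), (1.0, 0.9), (0.85, 0.95), (0.2, 0.3), (0.1, 0.2), (0.3, 0.1)]
best = best_cluster(units)                      # clustering is done once, before scheduling
plan = schedule([4.0, 2.0, 6.0], units, best)   # each step searches only the best cluster
print(best, plan)
```

The point of the sketch is the cost structure: clustering runs once as preprocessing, and each scheduling step then compares only the units in the selected cluster instead of every unit in the system. A fuller version would fall back to the next-best cluster when the preferred one is saturated; how the paper handles that case is not stated in the abstract.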