Existing fault-tolerant scheduling methods for heterogeneous distributed systems that rely on multiple task replicas neglect the scheduling makespan, the dependencies between tasks, and link failures; in addition, the strict scheduling method they use yields long makespans. To address these problems, a reliability calculation method that accounts for both node (processor) failures and link failures under the general scheduling method is first proposed. A 0-1 integer linear programming model of this general scheduling problem is then formulated. Finally, the reliability-aware multi-duplication task general scheduling (RAMD_TGS) algorithm is proposed to solve the model; it uses the population evolution of a genetic algorithm to search for the processor to which each task replica is mapped and the replica's start time on that processor. Experiments show that RAMD_TGS meets the reliability requirements and, compared with algorithms based on the strict scheduling method, further reduces the scheduling makespan. The resource overhead of the algorithm is also acceptable.
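To make the idea of a reliability calculation that covers both processor and link failures concrete, the following is a minimal sketch and not the method of the paper. It assumes a constant-rate (exponential) failure model, independent failures, and that a task succeeds if at least one of its replicas succeeds; task dependencies, which the general scheduling problem also has to handle, are ignored here. The `Replica` class, its fields, and both functions are hypothetical names introduced only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Replica:
    """One replica of a task (hypothetical structure, for illustration only)."""
    proc_failure_rate: float  # assumed constant failure rate of the mapped processor
    exec_time: float          # execution time of the replica on that processor
    link_failure_rate: float  # assumed constant failure rate of the incoming link
    comm_time: float          # time the input data spends on that link


def replica_reliability(r: Replica) -> float:
    # Assumption: exponential failure model, so the probability that neither
    # the processor nor the link fails is exp(-(rate * time)) summed over both.
    exposure = r.proc_failure_rate * r.exec_time + r.link_failure_rate * r.comm_time
    return math.exp(-exposure)


def task_reliability(replicas: list[Replica]) -> float:
    # Assumption: replicas fail independently and one surviving replica is
    # enough, so the task fails only if every replica fails.
    all_fail = 1.0
    for r in replicas:
        all_fail *= 1.0 - replica_reliability(r)
    return 1.0 - all_fail


if __name__ == "__main__":
    single = [Replica(1e-3, 10.0, 1e-4, 5.0)]
    double = single + [Replica(2e-3, 8.0, 1e-4, 5.0)]
    print(f"one replica:  {task_reliability(single):.6f}")
    print(f"two replicas: {task_reliability(double):.6f}")
```

Under these assumptions, adding a replica can only raise the task reliability. A scheduler can therefore trade extra resource usage for reliability, and a search procedure such as a genetic algorithm then has to decide where and when each replica runs so that the makespan stays short.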