To address reliability-aware scheduling under the primary-backup scheme in heterogeneous distributed environments, this paper proposes DRCAMD, a distributed fault-tolerant scheduling algorithm based on priority constraints and driven by reliability cost and makespan (schedule length). While guaranteeing schedulability, the algorithm uses the reliability of the nodes and communication links of the heterogeneous distributed environment, together with makespan, as adjustable local objective functions. This yields a fault-tolerant schedule with higher reliability and shorter execution time, and avoids assigning tasks to nodes with high failure rates. In addition, backup copies are executed passively and with primary-backup overlapping, which gives the fault-tolerant scheduling algorithm greater flexibility. Simulation experiments show that DRCAMD outperforms existing fault-tolerant scheduling algorithms.
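The trade-off between reliability cost and makespan can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the DRCAMD definition: reliability cost is modeled as failure rate multiplied by execution time, both terms are normalized by their maximum over the candidate nodes, and the names `Node`, `pick_node`, and `alpha` are hypothetical. Link reliability and primary-backup placement are left out.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    failure_rate: float  # assumed failures per unit time
    ready_time: float    # earliest time the node becomes free
    exec_time: float     # execution time of the task on this node


def pick_node(nodes, alpha=0.5):
    """Pick the node minimizing a weighted sum of reliability cost and finish time.

    alpha close to 1 favors reliability; alpha close to 0 favors a short schedule.
    Both terms are normalized by their maximum over the candidate nodes.
    """
    # Assumed reliability cost model: failure rate times execution time.
    costs = [n.failure_rate * n.exec_time for n in nodes]
    finishes = [n.ready_time + n.exec_time for n in nodes]
    max_cost = max(costs) or 1.0      # guard against division by zero
    max_finish = max(finishes) or 1.0

    def score(i):
        return alpha * costs[i] / max_cost + (1 - alpha) * finishes[i] / max_finish

    best = min(range(len(nodes)), key=score)
    return nodes[best]


nodes = [
    Node("fast_unreliable", failure_rate=0.05, ready_time=0.0, exec_time=4.0),
    Node("slow_reliable", failure_rate=0.001, ready_time=0.0, exec_time=10.0),
]
reliability_first = pick_node(nodes, alpha=0.9)
speed_first = pick_node(nodes, alpha=0.1)
```

With a large `alpha` the sketch steers the task away from the node with the higher failure rate, and with a small `alpha` it prefers the node that finishes sooner, which is the kind of adjustable behavior the abstract describes.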