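Bioinformatics applications have two characteristics: the data of different instances are independent of one another, while the data of the fragments within a single instance are partially dependent. To exploit these characteristics, the pipelining principle is integrated into the workflow. Different instances of the same process are executed in parallel level by level, and instances or fragments are executed in pipelined fashion across processes, which yields a multi-level pipelining effect. In addition, while instances or fragments are executing, a dynamic multi-granularity replica creation mechanism is applied according to how idle the resources are and the execution state of the tasks. This balances the system load, improves resource utilization, and effectively reduces the execution time of the whole application flow.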
To improve service scheduling performance on a heterogeneous bioinformatics grid platform, a new strategy is proposed that integrates multi-pipelining and dynamic multi-granularity replicas with the grid workflow. Two scheduling algorithms, MP-gridWF and MP&MR-gridWF, are discussed. The experimental results indicate that significant improvements are achieved in a variety of scenarios and that an optimal makespan can be obtained.
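The effect of pipelining independent instances on makespan can be illustrated with a minimal sketch. This is not MP-gridWF or MP&MR-gridWF; it only assumes that each workflow stage is served by one resource that handles one instance at a time, and it ignores replicas and heterogeneity. The function names and the stage times are hypothetical, chosen purely for illustration.

```python
from typing import List


def sequential_makespan(times: List[List[float]]) -> float:
    """Makespan when each instance runs all its stages before the next starts."""
    return sum(sum(row) for row in times)


def pipelined_makespan(times: List[List[float]]) -> float:
    """Makespan when instances flow through the stages as a pipeline.

    times[i][j] is the (assumed) duration of stage j for instance i.
    A stage can start on an instance only once the previous stage of that
    instance is done and the stage has finished the previous instance.
    """
    if not times:
        return 0.0
    n_stages = len(times[0])
    # stage_free[j]: time at which stage j finished its previous instance
    stage_free = [0.0] * n_stages
    for row in times:
        prev_stage_done = 0.0  # finish time of the previous stage, this instance
        for j, duration in enumerate(row):
            start = max(prev_stage_done, stage_free[j])
            stage_free[j] = start + duration
            prev_stage_done = stage_free[j]
    return stage_free[-1]


# Four independent instances, three stages each (illustrative durations).
stage_times = [[2.0, 3.0, 1.0] for _ in range(4)]
print(sequential_makespan(stage_times))  # 24.0
print(pipelined_makespan(stage_times))   # 15.0
```

In this toy case the slowest stage (3.0 time units) bounds the pipeline throughput, which is the kind of imbalance that a replica mechanism such as the one described above would aim to relieve.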