Traditional data placement strategies for scientific workflows reduce data transfer time but do not also account for load balancing among data centers. To address this, a data placement strategy based on multi-objective optimization is proposed. The strategy first generates a placement scheme for the fixed datasets, then uses the multi-objective optimization algorithm KnEA (Knee Point Driven Evolutionary Algorithm) to place the flexible (non-fixed) datasets, and finally obtains a global placement scheme for all datasets. KnEA exploits the fact that knee points exhibit better convergence than ordinary non-dominated individuals, and it takes the balance among multiple optimization objectives into account. As a result, it can produce placement schemes that perform well in both data transfer time and load balancing. Comparative experiments against two other strategies demonstrate the effectiveness of the proposed strategy.
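To illustrate the knee-point idea the abstract relies on, the following is a minimal sketch rather than the KnEA algorithm itself. It assumes two minimized objectives, data transfer time and load imbalance, that have already been normalized to comparable scales. It filters the non-dominated candidates and selects the one farthest from the line joining the two extreme solutions, on the side facing the ideal point. The names `Point`, `dominates`, `non_dominated`, and `knee_point` are hypothetical, and full KnEA additionally uses adaptive neighborhoods and an evolutionary loop that are omitted here.

```python
from typing import List, Tuple

# Hypothetical representation: (transfer_time, load_imbalance), both minimized.
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in both objectives and strictly better in one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def non_dominated(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def knee_point(front: List[Point]) -> Point:
    """Pick the front member farthest from the line through the two extremes.

    Simplification (assumption): a single global knee is returned, whereas
    KnEA identifies knee points within adaptive neighborhoods.
    """
    a = min(front, key=lambda p: p[0])  # extreme with the best transfer time
    b = min(front, key=lambda p: p[1])  # extreme with the best load balance
    # Line through a and b written as A*x + B*y + C = 0.
    A = b[1] - a[1]
    B = a[0] - b[0]
    C = b[0] * a[1] - a[0] * b[1]
    norm = (A * A + B * B) ** 0.5
    if norm == 0:  # degenerate front: both extremes coincide
        return a
    # With this orientation, positive values lie on the side of the ideal point.
    return max(front, key=lambda p: (A * p[0] + B * p[1] + C) / norm)


candidates = [(1.0, 10.0), (10.0, 1.0), (3.0, 3.0), (6.0, 6.0), (2.0, 8.0), (8.0, 2.5)]
front = non_dominated(candidates)
print(knee_point(front))
```

In this made-up sample, (6.0, 6.0) is dropped because (3.0, 3.0) dominates it, and (3.0, 3.0) is selected as the knee: it gives up a little in each objective relative to the extremes but is far better balanced than either.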