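Workflow-oriented data-intensive applications in cloud computing environments are now widely used in many fields. In a cloud environment with multiple data centers, these applications face new challenges in data placement, chiefly how to reduce data transfers across data centers, how to preserve the dependencies among data, and how to improve efficiency while maintaining global load balance. To address these challenges, this paper proposes a three-stage data placement strategy whose stages solve for and optimize the placement scheme with respect to three objectives: cross-data-center data transfer, data dependencies, and global load balance. Experiments show that the proposed strategy has good overall performance, and that it is especially effective at reducing the time overhead caused by cross-data-center data transfers during workflow execution.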
With the development of information technology, data-intensive applications in the cloud are being used in more and more fields. Because data centers in the cloud are decentralized, these applications face new challenges in data placement, mainly how to reduce the time cost of data movement between data centers, how to handle data dependencies, and how to keep the load across data centers relatively balanced. This paper proposes a data placement strategy whose three stages address these three challenges respectively. Simulation shows that the strategy can effectively reduce the time cost of data movement across data centers during an application's execution.