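Skyline queries are very time-consuming operations, and Skyline queries that involve multiple tables (Skyline-join queries) place an even heavier load on the database system, degrading the response time of the whole system. To address this problem, we propose a Skyline-join query processing algorithm based on MapReduce, the parallel processing framework designed by Google. The algorithm uses partition-level pruning to reduce complexity and thereby improve query performance. Experiments on Amazon's cloud computing platform (EC2) show that the algorithm effectively reduces redundant operations and network data transfer, is largely unaffected by the number of nodes and the data volume, and scales well.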
The Skyline query is one of the most expensive operators in a database system. Skyline queries that involve multiple tables, called Skyline-join queries, are even more costly to evaluate. In this paper, we therefore adopt Google's MapReduce, a parallel processing framework, to handle Skyline-join queries. We propose a novel parallel algorithm that prunes the dataset progressively and thereby reduces the network transfer cost. The algorithm is evaluated on Amazon's EC2, and the experiments verify its efficiency.
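For readers unfamiliar with the terms, the following is a minimal single-process sketch of what a Skyline-join query computes and why partition-wise pruning can cut the data that has to be merged. It is not the algorithm proposed in the paper: the function names (`dominates`, `skyline`, `partitioned_skyline`, `join`), the convention that smaller values are better, the round-robin chunking, and the sample data are all assumptions made for illustration, and the join is done naively before any pruning.

```python
from itertools import chain


def dominates(a, b):
    """Return True if a is no worse than b in every dimension and
    strictly better in at least one (assumption: smaller is better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points):
    """Naive O(n^2) skyline: keep the points no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def partitioned_skyline(points, num_chunks):
    """Two-phase skyline in the spirit of map/reduce (illustrative only)."""
    # "Map" phase: each chunk computes its local skyline independently.
    chunks = [points[i::num_chunks] for i in range(num_chunks)]
    local = [skyline(chunk) for chunk in chunks]
    # "Reduce" phase: only the local survivors are merged and filtered again.
    return skyline(list(chain.from_iterable(local)))


def join(left, right):
    """Equi-join on the first field; the joined tuple carries the
    remaining attributes of both sides."""
    return [l[1:] + r[1:] for l in left for r in right if l[0] == r[0]]


# Hypothetical sample tables: (join key, attributes...).
left = [("a", 1, 5), ("a", 3, 2), ("b", 4, 4)]
right = [("a", 2), ("b", 1), ("a", 6)]

joined = join(left, right)
result = partitioned_skyline(joined, num_chunks=2)
```

The two-phase version returns the same set as the direct computation because any point dominated inside its own chunk is also dominated in the full dataset, so discarding it early is safe. In a distributed setting, that early discard is what reduces the volume sent over the network.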