针对分布式数据库中关系及其分片多副本、多站点存储的特性会增加查询搜索空间及时间复杂度,从而降低查询执行计划(QEP)搜索效率的问题,提出一种基于分片分配选择器(FSS)设计准则的并行遗传-最大最小蚁群算法(PGA-MMAS)。首先,结合实际的企业分布式信息管理系统设计FSS,启发式选择较优关系副本,以减少查询连接代价并缩小PGA-MMAS的搜索空间;然后结合遗传算法(GA)收敛较快的优势,对最终连接关系进行编码和并行遗传操作,得到一组相对较优的QEP,并将其转化为并行最大最小蚁群算法(MMAS)的初始信息素分布,从而使其更快速地搜索到全局最优QEP;最后分别在不同关系数情况下对算法进行仿真实验,结果表明,基于FSS的PGA-MMAS搜索最优QEP的效率高于原GA以及基于FFS的GA、MMAS和GA-MMAS;经实际工程应用验证,所提算法搜索出的高质量QEP可以提高分布式数据库多关系查询效率。
Since relation and its fragments in the distributed database have multiple copies and multi-site storage characteristics,which increases the time and space complexity of query and results in lower search efficiency of Query Execution Plan( QEP),a Parallel Genetic Algorithm and Max-Min Ant System( PGA-MMAS) based on design principles of Fragments Site Selector( FSS) was proposed. Firstly,based on the design requirement of the distributed information management system for actual business,FSS was designed,which selected the best one from many copies of relationship heuristically to decrease query join cost and search space of PGA-MMAS. Secondly,Genetic Algorithm( GA) encoded the final join relations and conducted parallel genetic operation to get a set of relative optimal QEPs by taking advantage of quick convergence of GA. Then,QEPs were transformed into the initial pheromone distribution of Max-Min Ant System( MMAS) to obtain the optimal QEP quickly and efficiently. Finally,simulation experiments were conducted under different number of relation conditions,and the results show that PGA-MMAS based on FSS searches optimal QEP more efficiently than original GA,Fragments Site Selector-Genetic Algorithm( FSS-GA),Fragments Site Selector-Max-Min Ant System( FSS-MMAS) and Fragments Site Selector-Genetic Algorithm-Max-Min Ant System( FSS-GA-MMAS). And in the actual engineering application,the proposed algorithm can search high-quality QEP to improve the efficiency of multi-join query in distributed database.