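To address the high transmission and processing costs incurred when traditional distributed database query methods are applied to distributed spatial databases, this paper builds on existing optimization methods for distributed cross-border fragment joins, studies distributed spatial topological join query processing in depth, and proposes a spatial query optimization algorithm based on cross-border join optimization, which enriches the relational algebra equivalence transformation rules used in traditional distributed queries. In addition, global optimization strategies for distributed spatial queries are developed for the different types of fragment join, realizing distributed spatial query decomposition and data localization and thereby reducing the high cost of data transmission in distributed queries. Finally, distributed query optimization methods including node merging, the join merge tree, execution nodes, and the execution plan tree are proposed. The corresponding merging and optimization algorithms transform a global spatial query into concrete execution plans for the local spatial database at each site, eliminating redundant computation and optimizing the query computation strategy, which addresses the high processing cost of distributed spatial queries. Experiments on distributed spatial queries show that the proposed algorithms effectively improve the performance of distributed spatial queries.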
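The idea of decomposing a global spatial join into fragment joins can be illustrated with a minimal sketch. It is not the paper's algorithm: the `Fragment` type, the use of minimum bounding rectangles (MBRs) to decide which fragment pairs can contribute results, and the function names are all assumptions made for illustration. Fragment pairs stored at the same site become local joins, pairs at different sites whose MBRs intersect become cross-border joins, and pairs with disjoint MBRs are dropped.

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical fragment description: not taken from the paper.
@dataclass(frozen=True)
class Fragment:
    name: str
    site: str
    mbr: tuple  # (xmin, ymin, xmax, ymax)

def mbrs_intersect(a, b):
    """True if two rectangles overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def localize_join(r_frags, s_frags):
    """Split a global join R x S into local and cross-border fragment joins."""
    local, cross_border = [], []
    for r, s in product(r_frags, s_frags):
        if not mbrs_intersect(r.mbr, s.mbr):
            continue  # disjoint fragments cannot produce join results
        target = local if r.site == s.site else cross_border
        target.append((r.name, s.name))
    return local, cross_border

roads = [
    Fragment("R1", "siteA", (0, 0, 10, 10)),
    Fragment("R2", "siteB", (10, 0, 20, 10)),
    Fragment("R3", "siteC", (30, 0, 40, 10)),
]
rivers = [
    Fragment("S1", "siteA", (0, 0, 10, 10)),
    Fragment("S2", "siteB", (10, 0, 20, 10)),
]
local_joins, cross_border_joins = localize_join(roads, rivers)
```

In this toy setup, only the cross-border pairs would require moving spatial data between sites, and the pairs involving `R3` are pruned entirely.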
Because of complex data structures, complicated spatial relationships, and massive data volumes, distributed spatial queries are time-consuming and incur high transmission and processing costs. Query processing methods designed for traditional distributed databases cannot meet the demands of queries in distributed geospatial databases, so new query methods for distributed geospatial databases need to be studied. In this paper, distributed spatial join query processing is studied in depth on the basis of existing optimization methods for conventional query processing in traditional distributed databases, and a series of transformation rules for relational algebra expressions, based on cross-border topological join optimization rules, is proposed. After data localization, the query tree is optimized by equivalent transformation. A global optimization method for distributed spatial join queries over different fragments is studied, by which a global spatial query can be transformed effectively into a set of local fragment joins. The spatial join is then processed locally, which avoids transmitting spatial data among data nodes during query processing and thus improves query performance. To improve the efficiency of the method, several new concepts are put forward, including the query merged tree and the execution plan tree, which optimize the execution path of the query plan. For example, by adjusting the execution order, low-cost processes are executed first and time-consuming processes are executed on the result set generated by the preceding processes, which reduces the work done by the time-consuming parts, lowers the high cost of query processing, and improves the performance of distributed spatial queries.
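The execution-order example can be sketched as follows. This is a simplified illustration, not the paper's execution plan tree: the cost numbers, the point-in-circle test standing in for an expensive topological predicate, and all function names are assumptions. Each step is a filter with an estimated cost; running the cheap step first means the expensive step sees only the surviving candidates.

```python
# Illustrative only: names and cost numbers are assumptions, not from the paper.

def counted(predicate):
    """Wrap a predicate so the number of evaluations can be inspected."""
    def wrapper(item):
        wrapper.calls += 1
        return predicate(item)
    wrapper.calls = 0
    return wrapper

def run_in_cost_order(candidates, steps):
    """Apply (estimated_cost, predicate) filter steps, cheapest first."""
    result = list(candidates)
    for _cost, predicate in sorted(steps, key=lambda step: step[0]):
        result = [item for item in result if predicate(item)]
    return result

def in_bounding_box(pair):
    """Cheap test: is the point inside the circle's bounding box?"""
    (x, y), (cx, cy, r) = pair
    return abs(x - cx) <= r and abs(y - cy) <= r

def in_circle(pair):
    """Stand-in for an expensive exact geometric test."""
    (x, y), (cx, cy, r) = pair
    return (x - cx) ** 2 + (y - cy) ** 2 <= r * r

exact = counted(in_circle)
circle = (0.0, 0.0, 1.0)
pairs = [((0.0, 0.0), circle), ((0.9, 0.9), circle), ((5.0, 5.0), circle)]
# Steps are listed in the "wrong" order on purpose; sorting by cost fixes it.
matches = run_in_cost_order(pairs, [(100.0, exact), (1.0, in_bounding_box)])
```

Because both steps are conjunctive filters, reordering them does not change the result; it only changes how often the expensive predicate runs (here `exact.calls` counts two evaluations instead of three).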
Experiments based on vector data of China show that our methods reduce the cost of spatial joins and of data transmission among the nodes and improve performance by 28.5%, which demonstrates that our methods outperform the traditional methods in terms of