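As geographic information technology and computer network technology have converged and developed, geographic information services built on a global framework place increasingly high demands on massive data management, and the traditional single-center relational database management model can no longer meet them. Distributed file systems, semi-structured databases, and relational databases have complementary strengths, which offers a new technical approach to managing massive data efficiently. This paper proposes an integrated storage and management architecture for spatial data in a distributed environment and designs logical organization and physical storage models for vector and raster data. A unified layered + partitioned data division rule enables integrated management of vector and raster data in a distributed environment. The model exploits the characteristics of relational and semi-structured databases to manage spatial indexes and entity data separately, which effectively improves data processing and access efficiency. Experiments show that the model offers greater data management capacity and can provide an effective solution for building data service centers in a distributed environment.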
With the combination and development of geographic information technology and computer network technology, geographic information services based on a global framework demand more efficient management of massive data, and the traditional single-center relational database management mode can no longer meet these requirements. Because distributed file systems, semi-structured databases, and relational databases have complementary advantages, they offer a new technical approach to the efficient management of massive data. To achieve highly effective geospatial data management, this paper presents an integrated architecture for the storage and management of geospatial data in a distributed environment and designs a user-oriented massive geospatial data integration model and a distributed storage organization model. The model adopts a technical route that combines a NoSQL database with a relational database, together with a layered + partitioned data model and a multi-level index mechanism for rapid access to massive data, and thereby realizes the integrated management of vector and raster data in a distributed environment. Because the model takes advantage of both relational and semi-structured databases, structured geographic information, spatial indexes, and entity data can be managed separately, which effectively improves the efficiency of data processing and access. Vector and raster data are the largest and most widely used types of geospatial data. An experimental system implementing the vector and raster data management model was therefore set up in a test environment. TB-level data were used to test data loading, index (pyramid) creation, and concurrent data access efficiency. Compared with the traditional data model, the proposed model greatly improves data management capacity, processing speed, and access efficiency.
The results show that the model supports parallel operation in a distributed environment, with a higher data management c