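A spatial outlier is a spatial object in a spatial dataset whose non-spatial attribute values differ markedly from those of the other spatial objects in its neighborhood. Spatial data are generally stored by geographic distribution and are massive in volume, so the traditional centralized processing model cannot meet requirements such as the efficiency of massive data processing and the security of the spatial data themselves. Therefore, building on GeoKS-Grid, the geographical knowledge service grid platform developed by our research group, this paper proposes a grid-based distributed architecture for distributed spatial outlier mining, formulates a strategy for distributed spatial outlier mining in a grid environment, and implements a concrete distributed spatial outlier mining algorithm. In addition, following the general process of distributed spatial data mining and the principle that grid services should be general, reusable, and composable, the algorithm is decomposed at a reasonable granularity and encapsulated as several basic atomic services; these services are then discovered and composed through a grid workflow to carry out distributed spatial outlier mining, including local outlier mining and global outlier mining. Finally, an outlier analysis of soil data from the ecological geochemical survey of Fujian Province is used to verify the soundness and effectiveness of the services and the system.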
A spatial outlier is a spatial object whose non-spatial attribute values deviate significantly from those of the other spatial objects in its neighborhood. Identifying spatial outliers can reveal unexpected knowledge and has a number of practical applications. Massive volumes of spatial data are maintained at geographically distributed sites across wide area networks (WANs), so the data must be analysed and processed with a high-performance distributed parallel processing system. Grid computing is one of the most effective approaches to meeting this requirement. The geographical knowledge grid platform (GeoKS-Grid) established by our research group applies the knowledge grid to geo-information science. It integrates grid computing, web services, WebGIS, data mining, information visualization, ontology-based knowledge bases and knowledge reasoning, online analytical processing, decision analysis, data warehousing, and workflow technologies to form a geographical problem-solving environment. In this paper, a grid-based distributed framework and the corresponding strategy for a distributed spatial data mining system are discussed, and a distributed algorithm for spatial outlier mining is designed and implemented. In general, the process of distributed spatial outlier mining can be viewed as a series of services, including atomic services and composite services. Following the principle that web services should be reusable and composable, the distributed spatial outlier mining algorithm is decomposed into several atomic grid services. Distributed spatial outlier mining, which comprises local spatial outlier mining and global spatial outlier mining, is realized through a grid workflow that discovers and composes the atomic knowledge grid services provided by the knowledge grid. Finally, a demonstration application is carried out on soil geochemistry data collected by the Ecological Geochemistry Survey of the Fujian Coastal Economic Belt, and the efficiency and validity of the distributed spatial outlier mining service and system are verified.
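The definition of a spatial outlier given above can be illustrated with a minimal single-site sketch. It is not the distributed algorithm of this paper, which the abstract does not specify. It assumes a common neighborhood-difference test: each object's attribute value is compared with the mean over its k nearest neighbors, and objects whose standardized difference exceeds a threshold are flagged. The function names, the k-nearest-neighbor neighborhood, and the threshold of 2.0 are all illustrative assumptions.

```python
import math
from statistics import mean, pstdev


def k_nearest(points, i, k):
    """Indices of the k objects spatially closest to object i (hypothetical helper)."""
    xi, yi, _ = points[i]
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.hypot(points[j][0] - xi, points[j][1] - yi))
    return others[:k]


def neighborhood_diffs(points, k):
    """Attribute value of each object minus the mean attribute of its neighbors."""
    return [
        points[i][2] - mean(points[j][2] for j in k_nearest(points, i, k))
        for i in range(len(points))
    ]


def spatial_outliers(points, k=4, threshold=2.0):
    """Flag objects whose standardized neighborhood difference exceeds the threshold.

    points: list of (x, y, attribute) tuples; k and threshold are assumed values.
    """
    diffs = neighborhood_diffs(points, k)
    mu, sigma = mean(diffs), pstdev(diffs)
    if sigma == 0:  # every object matches its neighborhood equally well
        return []
    return [i for i, d in enumerate(diffs) if abs((d - mu) / sigma) > threshold]


# A 3 x 3 grid of samples with attribute 10 everywhere except the center.
grid = [(x, y, 100.0 if (x, y) == (1, 1) else 10.0)
        for x in range(3) for y in range(3)]
flagged = spatial_outliers(grid, k=4, threshold=2.0)
```

In a distributed setting, a test of this kind could be run at each site as the local mining step, with a separate global step combining the per-site results; how that combination is done here is not stated in the abstract.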