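This paper proposes an architecture for integrating volunteered geographic information (VGI) into the Chinese digital gazetteer (CDG). Building on a VGI data crawling model (VDCM) and a toponym ontology, it addresses three problems that arise when extracting place name information from VGI data: geo/geo ambiguity (several geographic locations sharing one place name), mismatches between place names and their latitude/longitude, and multiple resources referring to the same place name. The corresponding solutions are to disambiguate using the name of the superior administrative division, to filter out wrong place name tags by consulting the hierarchy of the administrative division ontology, and to spatially cluster the multiple resources of a place name to obtain a single latitude/longitude. The paper concludes that the architecture effectively addresses the CDG's lack of support for spatial reasoning, its data isolation, and its restricted data maintenance and updating, and proposes that future work focus on correcting wrong place name tags, building a trust model, and integrating VGI into a distributed CDG.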
This paper proposes a model for integrating volunteered geographic information (VGI) into the Chinese digital gazetteer (CDG) that addresses three limitations of the CDG: no support for spatial reasoning, data isolation, and restricted data maintenance and updating. First, it constructs a VGI data crawling model (VDCM) and a toponym ontology. Second, it analyses the process of extracting place name information from VGI data and identifies three problems: place name geo/geo ambiguity, mismatches between place names and geographic footprints, and multiple resources referring to the same place name. Then, it presents solutions to these problems: using superior administrative divisions for disambiguation, referring to the administrative division hierarchy ontology to remove wrong tags, and clustering the different sources of the same place name. Finally, it outlines three directions for future work: correcting wrong place name tags, constructing a trust model, and integrating VGI into a distributed CDG.
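To make two of the solutions concrete, the following is a minimal Python sketch of disambiguation by superior administrative division followed by spatial clustering to a single coordinate. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not name a clustering algorithm, so a simple distance-threshold grouping (connected components within `radius_km`) is used here, and the `Tag` record, the function names, the 1 km radius, and the sample coordinates are all hypothetical.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Tag:
    """Hypothetical VGI record: place name, claimed parent division, and a point."""
    name: str
    parent: str
    lat: float
    lon: float


def haversine_km(a: Tag, b: Tag) -> float:
    """Great-circle distance in kilometres between two tagged points."""
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = sin(dlat / 2) ** 2 + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def disambiguate(tags, name, parent):
    """Resolve geo/geo ambiguity: keep tags for `name` under the given parent division."""
    return [t for t in tags if t.name == name and t.parent == parent]


def largest_cluster(tags, radius_km):
    """Group tags into connected components (tags within radius_km are linked)
    and return the largest component. Assumed stand-in for the paper's clustering."""
    unvisited = list(tags)
    best = []
    while unvisited:
        component = [unvisited.pop()]
        frontier = list(component)
        while frontier:
            current = frontier.pop()
            near = [t for t in unvisited if haversine_km(current, t) <= radius_km]
            for t in near:
                unvisited.remove(t)
            component.extend(near)
            frontier.extend(near)
        if len(component) > len(best):
            best = component
    return best


def representative_point(tags, name, parent, radius_km=1.0):
    """Return one (lat, lon) for a place name, or None if no tag matches."""
    cluster = largest_cluster(disambiguate(tags, name, parent), radius_km)
    if not cluster:
        return None
    lat = sum(t.lat for t in cluster) / len(cluster)
    lon = sum(t.lon for t in cluster) / len(cluster)
    return lat, lon


# Illustrative data only; the coordinates are approximate and not from the paper.
SAMPLE_TAGS = [
    Tag("Zhongshan Park", "Shanghai", 31.2200, 121.4200),
    Tag("Zhongshan Park", "Shanghai", 31.2210, 121.4190),
    Tag("Zhongshan Park", "Shanghai", 31.2205, 121.4205),
    Tag("Zhongshan Park", "Shanghai", 30.0000, 120.0000),  # mismatched footprint
    Tag("Zhongshan Park", "Beijing", 39.9100, 116.3900),
]
```

Calling `representative_point(SAMPLE_TAGS, "Zhongshan Park", "Shanghai")` first discards the Beijing record through the parent division, then drops the isolated point because it falls outside the largest cluster, and averages the remaining three points. Taking the centroid of the largest cluster is one possible choice; a density-based method or a median would be equally compatible with the description in the abstract.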