Over the years, water resources agencies and their business departments have built a large number of water informatization applications to meet their own needs, accumulating more than 2.5 PB of water data in the process. Because the collection and use of these data have always depended on separate business systems, the data are scattered across the data centers and business departments of the Ministry of Water Resources, the seven major river basins, the 31 provinces (autonomous regions and municipalities), and the Xinjiang Production and Construction Corps. The data are also heterogeneous in form, overlap redundantly across business lines, and conflict in their semantics and authorization, which severely restricts the efficient sharing and use of big data in the water sector. To meet the cross-industry, cross-department demand for sharing structured, semi-structured, and unstructured water data, this article proposes developing a sharing technology for massive heterogeneous water data based on a distributed catalog, and building a big water data sharing platform that spans the Ministry of Water Resources, the river basin agencies, and the provinces, so that efficient nationwide sharing of water data without restructuring the underlying data becomes possible. The research makes full use of existing water informatization achievements and is one of the important foundational tasks for advancing from “digital water conservancy” to “smart water conservancy” during the 13th Five-Year Plan period.