In recent years, as more large-scale scientific facilities have been built and major scientific experiments carried out, scientific research has entered an unprecedented era of big data. Scientific research in the big data era is a process of big science, big demand, big data, big computing, and big discovery. Developing a data management system that supports the full life cycle of scientific big data is therefore of great significance. In this paper, we first introduce the background to the development of scientific big data management systems. We then define the concept of scientific big data and describe its three key characteristics. After a review of scientific data resource development and of research progress on scientific data management systems, we propose a scientific big data management framework that covers the full life cycle of scientific data management. We further analyze the key technologies of scientific big data management systems from five aspects: data fusion, real-time data analysis, long-term storage, cloud service architecture, and data opening and sharing mechanisms. Finally, we summarize the research progress in this field and look ahead to the application prospects of scientific big data management systems in scientific research.