As more large ground- and space-based observing facilities enter service and more large sky surveys get under way, astronomical data volumes are now measured in terabytes, petabytes, and even exabytes, and astronomy has entered the "big data" era. Faced with this ocean of data, how to store and manage it effectively has become a central problem for astronomers. Data storage and management is not only the task of astronomical data centers; individual astronomers also need to manage their own research data effectively. Loading massive amounts of data into a database automatically is the basic prerequisite for managing them, and efficient indexing is the core of data management. We have therefore designed and developed AutoDB, a tool for managing massive astronomical data. AutoDB uses virtual-terminal monitoring to load massive data into the database automatically, and it automatically builds the Q3C (Quad Tree Cube) sky-partitioning index on the data, a two-dimensional spatial index that supports efficient management. Improving and refining such data-management tools provides a sound basis for the data fusion, data analysis, and data mining in astronomers' subsequent research. For astronomers working with big data in particular, an automated database-management tool lets them concentrate on their scientific research.
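To illustrate the idea behind a quad-tree-cube sky partition, the following minimal Python sketch maps a sky position (right ascension, declination) to a single integer pixel index: the unit sphere is projected onto the six faces of a cube, each face is subdivided as a quadtree, and the cell coordinates are bit-interleaved into one key. This is not the Q3C code itself and not AutoDB's implementation; the function names, the face numbering, and the `depth` parameter are assumptions made for illustration only.

```python
import math


def cube_face_xy(ra_deg, dec_deg):
    """Project a sky position onto a cube face.

    Returns (face, x, y) with x and y in [0, 1].
    The face numbering (0 = north, 5 = south, 1-4 = equatorial)
    is an arbitrary choice for this sketch.
    """
    ra = math.radians(ra_deg % 360.0)
    dec = math.radians(dec_deg)
    # Unit vector pointing at (ra, dec).
    vx = math.cos(dec) * math.cos(ra)
    vy = math.cos(dec) * math.sin(ra)
    vz = math.sin(dec)
    ax, ay, az = abs(vx), abs(vy), abs(vz)
    # The dominant component selects the face; dividing by it
    # gives face coordinates u, v in [-1, 1].
    if az >= ax and az >= ay:
        face = 0 if vz > 0 else 5
        u, v = vx / az, vy / az
    elif ax >= ay:
        face = 1 if vx > 0 else 3
        u, v = vy / ax, vz / ax
    else:
        face = 2 if vy > 0 else 4
        u, v = vx / ay, vz / ay
    return face, (u + 1.0) / 2.0, (v + 1.0) / 2.0


def interleave_bits(ix, iy, depth):
    """Interleave the low `depth` bits of ix and iy (Z-order key)."""
    key = 0
    for bit in range(depth):
        key |= ((ix >> bit) & 1) << (2 * bit)
        key |= ((iy >> bit) & 1) << (2 * bit + 1)
    return key


def sky_pixel_index(ra_deg, dec_deg, depth=8):
    """Return an integer pixel index in [0, 6 * 4**depth)."""
    face, x, y = cube_face_xy(ra_deg, dec_deg)
    n = 1 << depth  # cells per face edge
    ix = min(int(x * n), n - 1)  # clamp the x == 1.0 edge case
    iy = min(int(y * n), n - 1)
    return face * n * n + interleave_bits(ix, iy, depth)
```

Because nearby positions on the same face tend to receive nearby keys, an ordinary one-dimensional index (for example a B-tree on this integer column) can serve two-dimensional positional queries, which is the general motivation for this kind of sky-partitioning index.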