The arrival of the big data era has brought new techniques, systems, and products. How to compare and evaluate different systems objectively has naturally become a hot research topic, a situation much like the one more than thirty years ago, when database systems were flourishing. It is well known that benchmarking has played an important role throughout the successful development of database systems and has greatly advanced database technology and systems. A data management system benchmark is a set of specifications for evaluating and comparing the performance of different database systems; it aims to reflect, objectively and comprehensively, the performance gaps between systems with similar functionality, and thereby to promote technological progress and guide the healthy development of the industry. Data management system benchmarks are closely tied to applications: the evolution of applications creates new data management needs, which drive innovation in data management techniques, which give rise to multiple data management systems and platforms, which in turn call for new benchmarks to evaluate them. These benchmarks are diverse: they include benchmarks for relational data as well as benchmarks for non-relational data such as semi-structured data, object data, streaming data, and spatial data. With the current development of new data systems, a wave of research on benchmarks for big data management systems has arrived as expected, and this research is likewise closely tied to application requirements. Overall, existing data management system benchmarks do not fully reflect the characteristics of big data. From a methodological point of view, however, the experience gained from more than thirty years of data management system benchmarking is the most valuable reference for developing big data benchmarks, which is the main motivation of this paper. This paper systematically reviews the history of data management system benchmarks, analyzes their achievements, and outlines future directions.