HBase is a distributed key-value database designed for data at the scale of billions of records, and it provides efficient reads and writes over massive data. However, because HBase supports only key-value queries, it cannot satisfy the query needs of spatio-temporal applications. Existing work has two problems. First, it does not consider the time dimension, which is frequently queried. Second, most of it supports multidimensional and spatial queries by designing the HBase schema and row key, which does not fundamentally improve retrieval performance. To address these deficiencies, this paper studies the internal index mechanism of HBase in depth and proposes HST, a structure based on the meta table and suited to spatio-temporal retrieval, in which the spatial and temporal dimensions are each indexed through the meta table. On this basis, algorithms for spatio-temporal range queries and kNN queries are designed, together with the corresponding parallel query algorithms. Experiments on real datasets show that, compared with existing work, HST-based spatio-temporal retrieval in HBase is clearly more efficient and can support the use of HBase for querying massive spatio-temporal data.
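To make the row-key limitation concrete, the following is a minimal sketch of the row-key design style of approach that the abstract contrasts with; it is not the HST structure, whose details are not given here. It assumes a hypothetical in-memory sorted store (`SortedKV`) standing in for an HBase table, a hypothetical row key made of a time bucket followed by a Z-order (Morton) code of a grid cell, and small integer coordinates. All names are illustrative.

```python
import bisect

def interleave(x: int, y: int, bits: int = 16) -> int:
    """Z-order (Morton) code: bits of x go to even positions, bits of y to odd."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z

def row_key(t_bucket: int, x: int, y: int):
    """Hypothetical row key: time bucket first, then the cell's Morton code."""
    return (t_bucket, interleave(x, y))

class SortedKV:
    """Toy stand-in for a table sorted by row key (assumption, not the HBase API)."""
    def __init__(self):
        self._keys, self._vals = [], []

    def put(self, key, value):
        i = bisect.bisect_left(self._keys, key)
        self._keys.insert(i, key)
        self._vals.insert(i, value)

    def scan(self, start, stop):
        """Return all (key, value) pairs with start <= key <= stop."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_right(self._keys, stop)
        return list(zip(self._keys[lo:hi], self._vals[lo:hi]))

def range_query(store, t0, t1, x0, y0, x1, y1):
    """Spatio-temporal range query as one key-range scan per time bucket.

    Every cell inside the rectangle has a Morton code between the codes of
    the two corners, but so do cells outside it, so rows must be filtered.
    Returns the matching payloads and the number of rows scanned.
    """
    hits, scanned = [], 0
    for t in range(t0, t1 + 1):
        for _, (x, y, payload) in store.scan(row_key(t, x0, y0), row_key(t, x1, y1)):
            scanned += 1
            if x0 <= x <= x1 and y0 <= y <= y1:
                hits.append(payload)
    return hits, scanned
```

In this sketch the scan reads more rows than it returns, and one scan is issued per time bucket. This is the kind of overhead that motivates indexing space and time below the row-key level.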