To address the low efficiency of OLAP (online analytical processing) aggregation caused by traditional row-store structures, an OLAP data access technique based on dimension stores is designed. First, the dimension attributes and the measure attributes of the OLAP fact table are defined as two column families, and all attributes of each dimension table are defined as one column family. The dimension tables are binary-encoded to generate dimension hierarchy codes, which preserve the hierarchical semantics of the dimensions. The data are organized by column as (dimension hierarchy code, measure value) pairs, which eliminates the complex join operations between the dimension tables and the fact table at query time. Then, a B+ tree is built bottom-up to index the dimension hierarchy codes, which speeds up data reads. Dimensions and measures are added and deleted by adding and deleting the corresponding columns in the fact table and in the dimension hierarchy code-measure tables. The performance analysis results show that this OLAP data access technique has good scalability, can efficiently manage and access massive multidimensional OLAP data, and effectively supports upper-layer OLAP aggregation.
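As an illustration of how a dimension hierarchy code can replace a join with a range scan, the following minimal Python sketch packs one member index per hierarchy level into a single integer and aggregates measures by code prefix. The abstract does not specify the encoding layout, so the bit widths in `LEVEL_BITS`, the three-level time dimension, and all function names are assumptions made for this example. A sorted list searched with `bisect` stands in for the leaf level of the B+ tree; it is not a B+ tree implementation.

```python
from bisect import bisect_left, bisect_right

# Hypothetical bit widths for a three-level time dimension: year, month, day.
LEVEL_BITS = (4, 4, 5)


def encode(members, level_bits=LEVEL_BITS):
    """Pack one member index per hierarchy level into a single integer code."""
    code = 0
    for value, bits in zip(members, level_bits):
        if not 0 <= value < (1 << bits):
            raise ValueError("member index does not fit its level width")
        code = (code << bits) | value
    return code


def prefix_range(prefix, level_bits=LEVEL_BITS):
    """Return the inclusive code range covered by a prefix of upper levels."""
    used = len(prefix)
    rest = sum(level_bits[used:])  # bits left for the unspecified lower levels
    low = encode(prefix, level_bits[:used]) << rest
    return low, low + (1 << rest) - 1


def build_index(rows):
    """Sort (code, measure) pairs by code; a stand-in for B+ tree leaves."""
    pairs = sorted((encode(members), value) for members, value in rows)
    codes = [code for code, _ in pairs]
    measures = [value for _, value in pairs]
    return codes, measures


def aggregate(codes, measures, prefix):
    """Sum the measures whose code falls under the given hierarchy prefix."""
    low, high = prefix_range(prefix)
    start = bisect_left(codes, low)
    stop = bisect_right(codes, high)
    return sum(measures[start:stop])


# Illustrative fact rows: ((year, month, day) member indexes, measure value).
rows = [
    ((1, 1, 10), 5.0),
    ((1, 1, 20), 7.0),
    ((1, 2, 3), 1.5),
    ((2, 1, 1), 4.0),
]
codes, measures = build_index(rows)
total_year1 = aggregate(codes, measures, (1,))
total_year1_month1 = aggregate(codes, measures, (1, 1))
```

Because all descendants of a hierarchy member share its code prefix, rolling up to a coarser level reduces to one contiguous range over the sorted codes, with no lookup in the dimension table. A real B+ tree would locate the same range by descending from the root and then following leaf links.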