Computing the information network data cube (InfoNetCube) is the foundation of online analytical processing over information networks. Unlike a traditional data cube, however, an InfoNetCube consists of multiple sub-lattices of cuboids, and every cell of every cuboid in each lattice holds a topic graph (also called a graph measure), so its storage overhead exceeds that of a traditional data cube by more than two orders of magnitude. Partially materializing an InfoNetCube quickly and efficiently is therefore a challenging research problem. This paper proposes an InfoNetCube materialization strategy based on the idea of dialysis computing. By exploiting the anti-monotonicity of the topic graph measure along the information and topology dimensions, it presents a dialysis-based space-pruning algorithm that quickly filters out the subgraph measures, cuboid cells, cuboids, and even entire lattices that cannot be hit. Experimental results show that the proposed strategy prunes information network cuboids effectively and reduces running time by 75% on average compared with the partial materialization strategy based on base cuboids.
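To make the role of anti-monotonicity concrete, the following is a minimal sketch of threshold pruning over a cuboid lattice. It is not the dialysis algorithm of this paper, only a generic illustration under stated assumptions: each edge carries one value per dimension, a cell's topic graph is the set of edges matching the cell, and the graph measure is the edge count. Because refining a cell can only remove edges, a cell that falls below the threshold cannot have a qualifying descendant, so the whole branch is skipped. The names `materialize`, `min_edges`, and the sample dimensions `venue` and `year` are hypothetical.

```python
def materialize(edges, dims, min_edges):
    """Materialize only the cells whose topic graph has >= min_edges edges.

    edges: list of (u, v, attrs), where attrs maps each dimension to a value.
    dims: ordered list of dimension names (hypothetical, for illustration).
    Returns (cells, visited): cells maps a cell, written as a tuple of
    (dimension, value) pairs, to its edge list; visited counts examined cells.
    """
    cells = {}
    visited = 0

    def expand(cell, graph, start):
        nonlocal visited
        visited += 1
        # Assumed measure: edge count. It is anti-monotone, since any
        # refinement of this cell selects a subset of these edges.
        if len(graph) < min_edges:
            return  # prune this cell and every cell below it
        cells[cell] = graph
        # Refine by each remaining dimension, in a fixed order so that
        # every cell is generated exactly once.
        for i in range(start, len(dims)):
            dim = dims[i]
            parts = {}
            for edge in graph:
                parts.setdefault(edge[2][dim], []).append(edge)
            for value, subgraph in sorted(parts.items()):
                expand(cell + ((dim, value),), subgraph, i + 1)

    expand((), list(edges), 0)
    return cells, visited


# Toy network: four edges described by two dimensions.
edges = [
    ("a", "b", {"venue": "KDD", "year": 2010}),
    ("b", "c", {"venue": "KDD", "year": 2010}),
    ("a", "c", {"venue": "KDD", "year": 2011}),
    ("c", "d", {"venue": "VLDB", "year": 2010}),
]
cells, visited = materialize(edges, ["venue", "year"], min_edges=2)
```

In this toy run the cell `venue = VLDB` holds a single edge, so it is discarded and its refinement by `year` is never generated. A real graph measure would need its own anti-monotonicity argument before being pruned this way.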