This paper studies maximum Steiner connected k-core query processing based on graph compression. It proposes SC, a graph compression algorithm that preserves the answers to maximum Steiner connected k-core queries, and proves that queries evaluated on the SC-compressed graph are correct. Because a maximum Steiner connected k-core query only needs to find a connected region that satisfies the query's requirements, a second compression algorithm, TC, is proposed to further compress the compressed graph into a tree. Queries on the compressed tree are proved correct, and a linear-time query processing algorithm that works directly on the tree without decompression is presented. Experiments on real and synthetic datasets show that the compression algorithms shrink the original graph by 88% on average, and that compression is more effective on denser graphs, where the original graph is reduced by nearly 90%. Compared with query processing directly on the original graph, query processing on the compressed graph is more efficient, with an average speedup of 1 to 2 orders of magnitude.
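To make the query itself concrete, the following is a minimal Python sketch of a baseline that answers it directly on an uncompressed graph. It is not the SC or TC algorithm, whose details the abstract does not give. It assumes the usual definition of the query: given a set of query vertices, return the connected k-core that contains all of them for the largest possible k. The function names `core_numbers` and `max_steiner_connected_k_core` and the adjacency-dictionary input are illustrative choices, not taken from the paper.

```python
from collections import deque


def core_numbers(adj):
    """Return the core number of every vertex by minimum-degree peeling.

    adj maps each vertex to the set of its neighbours (undirected graph).
    The minimum is found by a linear scan, so this is simple rather than fast.
    """
    degree = {v: len(ns) for v, ns in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=degree.__getitem__)
        k = max(k, degree[v])  # core numbers never decrease along the peeling order
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


def max_steiner_connected_k_core(adj, query):
    """Baseline query on the original graph (hypothetical helper).

    Returns (k, vertices) for the largest k such that all query vertices lie
    in one connected component of the k-core, or None if the query vertices
    are not connected at all.
    """
    query = list(query)
    core = core_numbers(adj)
    k = min(core[q] for q in query)  # no larger k can contain every query vertex
    while k >= 0:
        # BFS from one query vertex, restricted to vertices of the k-core.
        start = query[0]
        seen = {start}
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for u in adj[v]:
                if core[u] >= k and u not in seen:
                    seen.add(u)
                    queue.append(u)
        if all(q in seen for q in query):
            return k, seen
        k -= 1
    return None
```

Each call recomputes the core numbers and may repeat the traversal for several values of k. That repeated work on the full graph is the cost that querying a precomputed compressed structure is meant to avoid.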