To speed up the clustering of large-scale graphs without sacrificing accuracy, the idea of sparsification is introduced into graph clustering: a graph sparsification step is added before the large graph is clustered. The sparsified graph preserves the cluster structure of the original graph well, so clustering can be carried out on a smaller dataset and therefore runs faster. For a graph constructed from the DBLP dataset, the k-medoids graph clustering algorithm is applied to both the original graph and the sparsified graph, and the running times and clustering accuracies are compared. Experimental results show that clustering on the sparsified graph greatly shortens the running time without reducing clustering accuracy, and that the resulting clusters are consistent with the actual categories.
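The sparsify-then-cluster pipeline described above can be illustrated with a minimal sketch. The abstract does not specify the sparsification rule or the distance used by k-medoids, so everything below is an assumption made for illustration only: edges are ranked by the Jaccard similarity of their endpoints' closed neighborhoods and each node keeps its top ceil(deg^0.5) edges, distances are hop counts, and the function names (`jaccard_sparsify`, `hop_distances`, `k_medoids`) and the toy graph are hypothetical rather than taken from the paper or from DBLP.

```python
from collections import deque
from math import ceil


def jaccard_sparsify(adj, exponent=0.5):
    """Assumed rule: each node keeps its ceil(deg**exponent) most similar edges."""
    kept = {u: set() for u in adj}
    for u, nbrs in adj.items():
        closed_u = nbrs | {u}

        def sim(v):
            # Jaccard similarity of the closed neighborhoods of u and v.
            closed_v = adj[v] | {v}
            return len(closed_u & closed_v) / len(closed_u | closed_v)

        ranked = sorted(nbrs, key=lambda v: (-sim(v), v))
        for v in ranked[:ceil(len(nbrs) ** exponent)]:
            kept[u].add(v)
            kept[v].add(u)  # keep the graph undirected
    return kept


def hop_distances(adj, source):
    """Breadth-first search: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def k_medoids(adj, medoids, max_iter=20):
    """Alternating k-medoids on hop distances; returns {medoid: members}."""
    nodes = sorted(adj)
    far = len(nodes)  # stand-in distance for unreachable pairs
    dist = {u: hop_distances(adj, u) for u in nodes}

    def d(u, v):
        return dist[u].get(v, far)

    medoids = list(medoids)
    for _ in range(max_iter):
        # Assignment step: each node joins its nearest medoid (ties by id).
        clusters = {m: [] for m in medoids}
        for u in nodes:
            clusters[min(medoids, key=lambda m: (d(u, m), m))].append(u)
        # Update step: the member with the smallest total distance wins.
        new_medoids = [
            min(members, key=lambda c: (sum(d(c, x) for x in members), c))
            for members in clusters.values()
        ]
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    return clusters


def edge_count(adj):
    return sum(len(nbrs) for nbrs in adj.values()) // 2


# Toy graph (not DBLP): two 4-node cliques joined by the bridge edge 3-4.
graph = {u: set() for u in range(8)}
for group in (range(0, 4), range(4, 8)):
    for a in group:
        for b in group:
            if a != b:
                graph[a].add(b)
graph[3].add(4)
graph[4].add(3)

sparse = jaccard_sparsify(graph)
full_clusters = sorted(k_medoids(graph, [0, 7]).values())
sparse_clusters = sorted(k_medoids(sparse, [0, 7]).values())
print(edge_count(graph), edge_count(sparse), full_clusters == sparse_clusters)
```

On this toy input the sparsified graph has fewer edges than the original while k-medoids recovers the same two groups on both, which mirrors the qualitative claim of the abstract; it says nothing about the running times or accuracies reported for DBLP.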