Abstract: This paper presents a local community partition algorithm based on topic and link. The algorithm combines the topic similarity and the link similarity of nodes into a single measure of similarity between nodes. It works locally, which removes the need to search for initial center nodes, and it uses local modularity as the termination condition for community partitioning. The algorithm is applied to a data set of Twitter users who took part in discussions of topics related to the Haiti earthquake. Three baseline community partition algorithms, i.e., one based on links only, one based on topics only, and one based on both topics and links, are applied to the same data set. Experimental results show that, measured by purity and entropy, the proposed algorithm outperforms the three baselines.
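The abstract names purity and entropy as evaluation measures but does not give their definitions. As a minimal sketch, assuming the standard clustering definitions (purity as the fraction of nodes carrying the majority label of their community, entropy as the size-weighted Shannon entropy of labels within each community), the two measures could be computed as follows. The function names, the user identifiers, and the topic labels are hypothetical and serve only as illustration; they are not taken from the paper's data set.

```python
from collections import Counter
from math import log2


def purity(communities, labels):
    """Fraction of nodes that carry the majority label of their community."""
    total = sum(len(c) for c in communities)
    majority = sum(
        Counter(labels[n] for n in c).most_common(1)[0][1]
        for c in communities
        if c
    )
    return majority / total


def entropy(communities, labels):
    """Size-weighted average Shannon entropy (in bits) of labels per community."""
    total = sum(len(c) for c in communities)
    result = 0.0
    for c in communities:
        if not c:
            continue
        counts = Counter(labels[n] for n in c)
        h = -sum((k / len(c)) * log2(k / len(c)) for k in counts.values())
        result += (len(c) / total) * h
    return result


# Hypothetical toy data: six users, two assumed ground-truth topic labels.
labels = {
    "u1": "rescue", "u2": "rescue", "u3": "rescue",
    "u4": "donation", "u5": "donation", "u6": "rescue",
}
communities = [{"u1", "u2", "u3"}, {"u4", "u5", "u6"}]

p = purity(communities, labels)   # higher is better
e = entropy(communities, labels)  # lower is better
print(f"purity={p:.4f} entropy={e:.4f}")
```

Under these assumed definitions, a partition is better when its purity is higher and its entropy is lower, which is the sense in which the proposed algorithm is compared with the three baselines.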