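[Purpose/significance] To improve the accuracy of community detection in citation networks, this paper proposes a community detection method based on weighted citation networks. [Method/process] Taking the Louvain community detection method as the algorithmic basis, scientific papers are represented with the vector space model, and an improved cosine similarity measure is used to compute the similarity between adjacent papers, which then serves as the edge weight. Considering both the content attributes and the structural attributes of papers, a sample-weighted community detection method for citation networks is proposed. [Result/conclusion] The algorithm combines the textual content attributes of the papers in a citation network with their topological structure attributes. Community detection experiments on papers published in the journal Scientometrics and on papers on the topic of CRISPR show that the method can improve community detection results in citation networks.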
[Purpose/significance] The study of community discovery has great value for text mining. To improve the accuracy of community detection in citation networks, this paper describes a new community detection algorithm for scientific literature based on weighted networks. [Method/process] The algorithm builds on the Louvain community detection algorithm and uses the vector space model to calculate the similarity between adjacent papers, which is taken as the weight of the link between them. The community structure is then detected on the resulting weighted network. [Result/conclusion] Experiments show that the proposed algorithm effectively improves the performance of community detection.
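The edge-weighting step described in the method can be illustrated with a minimal sketch. It assumes a plain bag-of-words term-frequency vector and the standard cosine similarity; the abstract does not specify the improved cosine measure, so that refinement is not reproduced here. The function names (`tf_vector`, `weight_citation_edges`, `weighted_modularity`) and the toy papers are hypothetical and serve only as illustration. The modularity function is the standard weighted modularity that Louvain-style methods optimize, included to show how a candidate partition of the weighted network could be scored.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector (a simple vector space model)."""
    return Counter(text.lower().split())


def cosine_similarity(u, v):
    """Standard cosine similarity between two sparse term-frequency vectors.

    Assumption: this is the plain measure, not the paper's improved variant.
    """
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def weight_citation_edges(texts, citations):
    """Weight each citation link by the text similarity of its two papers."""
    vectors = {pid: tf_vector(t) for pid, t in texts.items()}
    return {(a, b): cosine_similarity(vectors[a], vectors[b])
            for a, b in citations}


def weighted_modularity(weights, community):
    """Weighted modularity Q of a partition, treating links as undirected.

    weights: {(a, b): w}; community: {paper_id: community_label}.
    """
    m = sum(weights.values())
    if m == 0:
        return 0.0
    internal = Counter()   # total weight of links inside each community
    strength = Counter()   # total weighted degree of each community
    for (a, b), w in weights.items():
        strength[community[a]] += w
        strength[community[b]] += w
        if community[a] == community[b]:
            internal[community[a]] += w
    return sum(internal[c] / m - (strength[c] / (2 * m)) ** 2
               for c in strength)


# Toy data (hypothetical): four papers and three citation links.
texts = {
    "p1": "community detection citation network",
    "p2": "citation network community structure",
    "p3": "crispr gene editing",
    "p4": "crispr gene editing cas9",
}
citations = [("p1", "p2"), ("p3", "p4"), ("p2", "p3")]

weights = weight_citation_edges(texts, citations)
partition = {"p1": "A", "p2": "A", "p3": "B", "p4": "B"}
print(weights)
print(weighted_modularity(weights, partition))
```

The resulting weighted edges could then be handed to any Louvain implementation that accepts edge weights; one option, assuming a recent NetworkX release is available, is `networkx.community.louvain_communities(G, weight="weight")` on a graph built from these edges. Links between papers with no shared terms receive weight 0 under this plain measure, which effectively removes them from the modularity calculation.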