Detecting latent research communities in scientific collaboration networks is important for understanding how researchers cooperate and communicate and for discovering their research interests. Building on Latent Dirichlet Allocation, this paper proposes a Community-Author-Topic model that uses both the co-authorship relations among researchers and the content of their papers to discover latent sub-communities, and that extracts the research topics and the representative researchers of each sub-community. A Gibbs sampling algorithm for inference in the Community-Author-Topic model is also presented. Experiments on the NIPS dataset show that the research communities and research topics discovered by the proposed model are meaningful.
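For orientation, the Gibbs sampling inference mentioned in the abstract can be illustrated on the base model alone. The sketch below is a collapsed Gibbs sampler for plain Latent Dirichlet Allocation, not the Community-Author-Topic model itself, whose community and author variables are not specified here. The function name `lda_gibbs`, its parameters, and the default hyperparameters `alpha` and `beta` are assumptions made for illustration.

```python
import random


def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01,
              iters=200, seed=0):
    """Collapsed Gibbs sampling for plain LDA (illustrative sketch only).

    docs: list of documents, each a list of word ids in [0, vocab_size).
    Returns (doc_topic, topic_word) count tables.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assign = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1

                # Unnormalised conditional p(z = k | all other assignments).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]

                # Resample the topic and restore the counts.
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    return doc_topic, topic_word


if __name__ == "__main__":
    # Toy corpus with a hypothetical 6-word vocabulary.
    toy_docs = [[0, 1, 2, 0, 1], [3, 4, 5, 3, 4], [0, 2, 1, 5]]
    dt, tw = lda_gibbs(toy_docs, num_topics=2, vocab_size=6, iters=50)
    for d, row in enumerate(dt):
        print("doc", d, "topic counts:", row)
```

A model that also samples a community and an author for each token would extend the same remove, reweight, resample loop with additional count tables; that extension is not shown because its conditional distributions are not given here.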