Because of their small scale and rich information content, ego networks have become an important object of study in social network analysis. Existing community detection algorithms, however, focus mainly on large-scale global networks, and previous studies have shown that community structure in global networks is less pronounced than expected. This paper proposes a topic circle detection algorithm for ego networks. The algorithm introduces information entropy to measure whether a circle of users in an ego network shares a common topic, defines a new objective function, and discovers the topic circles in a user's ego network by optimizing that function heuristically. First, topics are extracted from microblog text to obtain each user's topical interests, and information entropy is used to evaluate the distribution of user topics. Then, a harmonic (trade-off) factor is used to fuse the structural function with the entropy function, which yields an objective function that combines information entropy with structural modularity. Finally, the objective function is approximated to obtain the optimal solution, which gives the topic circles of the ego network. Experiments on a Sina Weibo dataset show that the proposed algorithm effectively mines topic circles whose text is highly cohesive, and that the circles exhibit high cohesion and low coupling in their structure, which is of considerable practical value for the analysis and study of ego networks.
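The combination of an entropy term and a structural term can be illustrated with a small sketch. The abstract does not give the paper's actual objective function, so everything below is an assumption made for illustration: each user is reduced to a single dominant topic label, the structural term is standard Newman modularity, the entropy term is a size-weighted, normalized Shannon entropy, and the hypothetical parameter `alpha` plays the role of the harmonic factor. The function names `topic_entropy`, `modularity`, and `objective` are likewise invented here.

```python
import math
from collections import Counter


def topic_entropy(topics):
    """Shannon entropy (in bits) of the topic labels held by one circle."""
    counts = Counter(topics)
    total = len(topics)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def modularity(edges, circles):
    """Newman modularity of a partition of an undirected, unweighted graph."""
    m = len(edges)
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    label = {node: i for i, circle in enumerate(circles) for node in circle}
    internal = Counter()  # number of edges with both endpoints in circle i
    for u, v in edges:
        if label[u] == label[v]:
            internal[label[u]] += 1
    q = 0.0
    for i, circle in enumerate(circles):
        deg_sum = sum(degree[n] for n in circle)
        q += internal[i] / m - (deg_sum / (2 * m)) ** 2
    return q


def objective(edges, circles, user_topic, alpha=0.5):
    """Assumed score: alpha * modularity + (1 - alpha) * topic purity.

    Topic purity is 1 minus the size-weighted average of each circle's
    entropy, normalized by the maximum possible entropy. This is an
    illustrative stand-in, not the objective defined in the paper.
    """
    n_topics = len(set(user_topic.values()))
    max_h = math.log2(n_topics) if n_topics > 1 else 1.0
    n_users = sum(len(c) for c in circles)
    avg_h = sum(
        len(c) / n_users * topic_entropy([user_topic[u] for u in c]) / max_h
        for c in circles
    )
    return alpha * modularity(edges, circles) + (1 - alpha) * (1 - avg_h)


# Toy ego network: two triangles joined by one bridge edge (c, d).
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
user_topic = {"a": "sports", "b": "sports", "c": "sports",
              "d": "music", "e": "music", "f": "music"}

good = [{"a", "b", "c"}, {"d", "e", "f"}]  # follows both structure and topic
bad = [{"a", "b", "d"}, {"c", "e", "f"}]   # mixes topics and cuts triangles

# The topic-aligned partition scores higher than the mixed one.
print(objective(edges, good, user_topic) > objective(edges, bad, user_topic))
```

A heuristic search of the kind the abstract mentions could then, for example, move users between circles greedily while the score improves; that search strategy is not specified in the abstract and is not shown here.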