Building on user-interest theory and the follow relationships between users, and taking into account the important role that time plays in detecting hot topics on microblogs, this paper studies how to effectively obtain the newest and most valuable topics on a microblog platform. It proposes TimePageRank, a hot-topic detection algorithm that extends the classic PageRank algorithm with a time parameter. The algorithm first uses a voting mechanism to extract the topics that users are interested in and records each topic's creation time; it then computes a weight for each topic using a weighting formula; finally, it ranks the topics with TimePageRank to detect the hot topics on the microblog. Experimental results on a real data set demonstrate the effectiveness and efficiency of the method.
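
The three steps described above (voting, weighting, ranking) can be illustrated with a minimal Python sketch. The abstract does not give the weighting formula or the exact form of TimePageRank, so everything specific here is an assumption made for illustration: a vote is taken to be one post mentioning a topic, the weight is assumed to be the vote count multiplied by an exponential decay in the topic's age, and the ranking step is a personalized PageRank whose teleport distribution is proportional to those weights. The names `topic_weights`, `time_pagerank`, `decay`, and the topic-to-topic `edges` input are hypothetical and do not come from the paper.

```python
import math
from collections import defaultdict


def topic_weights(votes, now, decay=0.1):
    """Steps 1 and 2 (assumed form): count votes per topic, record each
    topic's earliest timestamp, and discount the count by topic age.

    votes: iterable of (topic, timestamp) pairs, timestamps in hours.
    """
    count = defaultdict(int)
    born = {}
    for topic, t in votes:
        count[topic] += 1
        born[topic] = min(t, born.get(topic, t))
    # Assumed weighting: votes * exp(-decay * age). Not the paper's formula.
    return {tp: count[tp] * math.exp(-decay * (now - born[tp])) for tp in count}


def time_pagerank(edges, weights, damping=0.85, iters=50):
    """Step 3 (assumed form): PageRank over a topic graph in which the
    teleport distribution is proportional to the time-decayed weights.

    edges: iterable of (src_topic, dst_topic) pairs; every topic that
    appears in edges must also appear in weights.
    """
    topics = sorted(weights)
    total = sum(weights.values())
    prior = {t: weights[t] / total for t in topics}
    out = defaultdict(list)
    for src, dst in edges:
        out[src].append(dst)
    rank = dict(prior)
    for _ in range(iters):
        nxt = {t: (1 - damping) * prior[t] for t in topics}
        for src in topics:
            if out[src]:
                share = damping * rank[src] / len(out[src])
                for dst in out[src]:
                    nxt[dst] += share
            else:
                # Dangling topic: spread its rank according to the prior.
                for t in topics:
                    nxt[t] += damping * rank[src] * prior[t]
        rank = nxt
    return rank


if __name__ == "__main__":
    # Two topics with the same number of votes; "b" is much newer.
    votes = [("a", 0), ("a", 1), ("b", 9), ("b", 10)]
    w = topic_weights(votes, now=10)
    ranks = time_pagerank([("a", "b"), ("b", "a")], w)
    for topic in sorted(ranks, key=ranks.get, reverse=True):
        print(topic, round(ranks[topic], 4))
```

In this toy input both topics receive two votes and link to each other symmetrically, so any difference in their final scores comes only from the assumed time decay, which favors the newer topic.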