Although users can obtain a large amount of information through search platforms, search results often exhibit topic drift and therefore fail to meet users' actual needs. To reduce topic drift, an improved PageRank algorithm is proposed. Building on the traditional PageRank algorithm, it first uses the vector space model (VSM) to calculate the similarity between pages, then assigns different regulatory factors according to that similarity and introduces them into the PageRank computation, so that the calculation of each page's PR value becomes more reasonable and well founded. The results show that, in search applications, the improved PageRank algorithm effectively reduces topic drift and returns results that better match users' needs.
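The pipeline outlined above (VSM similarity, similarity-dependent regulatory factors, PageRank) can be illustrated with a minimal sketch. The abstract does not give the actual formula, so everything specific here is an assumption: the whitespace tokenizer, the thresholds and factor values in the hypothetical `regulatory_factor`, and the choice to apply the factor as a weight on each outgoing link. It should be read as one possible reading of the idea, not as the authors' method.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency vectors (VSM)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def regulatory_factor(sim):
    """Hypothetical mapping from similarity to a regulatory factor.

    The thresholds and values are illustrative assumptions only.
    """
    if sim >= 0.5:
        return 1.5
    if sim >= 0.2:
        return 1.0
    return 0.5


def similarity_weighted_pagerank(links, texts, d=0.85, iterations=50):
    """PageRank in which each link's share of rank is scaled by a
    similarity-based regulatory factor (an assumed formulation)."""
    pages = sorted(texts)
    # Naive whitespace tokenization into term-frequency vectors.
    vectors = {p: Counter(texts[p].lower().split()) for p in pages}

    # Normalized outgoing link weights per page.
    weights = {}
    for p in pages:
        raw = {
            q: regulatory_factor(cosine_similarity(vectors[p], vectors[q]))
            for q in links.get(p, [])
            if q in vectors
        }
        total = sum(raw.values())
        weights[p] = {q: w / total for q, w in raw.items()} if total else {}

    n = len(pages)
    pr = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread uniformly.
        dangling = sum(pr[p] for p in pages if not weights[p])
        new_pr = {p: (1 - d) / n + d * dangling / n for p in pages}
        for p in pages:
            for q, w in weights[p].items():
                new_pr[q] += d * pr[p] * w
        pr = new_pr
    return pr


texts = {
    "A": "pagerank link analysis search ranking",
    "B": "pagerank search ranking algorithm",
    "C": "cooking pasta recipe sauce",
}
links = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}
ranks = similarity_weighted_pagerank(links, texts)
```

In this toy graph, page A links to both B and C, but only B shares vocabulary with A, so B receives the larger share of A's rank. In plain PageRank the two links would be weighted equally, which is the kind of topic-insensitive behavior the proposed method aims to correct.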