HITS is a classical link analysis algorithm whose main weakness is its susceptibility to topic drift. To address this problem, this paper proposes an improved algorithm, MCHITS, which uses a maximum-flow algorithm to refine HITS. MCHITS first expands the root set by two levels, so that it includes every page within link distance two of at least one page in the root set. It then takes the pages of the root set as seed nodes and applies a maximum-flow/minimum-cut algorithm to find a community centered on the root set; the pages in this community form the MC-base set. Experimental results show that running the iterative HITS computation on the MC-base set improves the relevance of query results and reduces the occurrence of topic drift.
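The pipeline described above (two-level expansion, min-cut community, HITS iteration) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The toy graph, the function names, the virtual `_source`/`_sink` nodes, the treatment of hyperlinks as undirected when computing the cut, and the capacities (unbounded from the source to each seed, `edge_cap` per hyperlink, 1 from each non-seed page to the sink) are all assumptions chosen for illustration, since the abstract does not specify them.

```python
from collections import defaultdict, deque

def expand(graph, root, depth=2):
    """Pages within `depth` links (in either direction) of any root page."""
    nbrs = defaultdict(set)
    for u, outs in graph.items():
        for v in outs:
            nbrs[u].add(v)
            nbrs[v].add(u)
    seen, frontier = set(root), set(root)
    for _ in range(depth):
        frontier = {v for u in frontier for v in nbrs[u]} - seen
        seen |= frontier
    return seen

def min_cut_community(graph, nodes, seeds, edge_cap=1):
    """Source side of a minimum cut, found with Edmonds-Karp max flow."""
    S, T = "_source", "_sink"  # hypothetical virtual nodes
    cap = defaultdict(lambda: defaultdict(int))
    for u in nodes:
        for v in graph.get(u, ()):
            if v in nodes:  # assumption: hyperlinks count as undirected
                cap[u][v] += edge_cap
                cap[v][u] += edge_cap
    for u in nodes:
        if u in seeds:
            cap[S][u] = float("inf")  # seeds are tied to the source
        else:
            cap[u][T] = 1             # assumed unit capacity to the sink
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent, queue = {S: None}, deque([S])
        while queue:
            u = queue.popleft()
            for v, c in list(cap[u].items()):
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if T not in parent:
            # No augmenting path left: pages still reachable form the community.
            return set(parent) - {S}
        path, v = [], T
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= flow
            cap[v][u] += flow

def hits(graph, nodes, iters=50):
    """Standard HITS iteration restricted to `nodes`, with L2 normalisation."""
    hub = {n: 1.0 for n in nodes}
    auth = dict(hub)
    for _ in range(iters):
        auth = {n: 0.0 for n in nodes}
        for u in nodes:
            for v in graph.get(u, ()):
                if v in nodes:
                    auth[v] += hub[u]
        hub = {u: sum(auth[v] for v in graph.get(u, ()) if v in nodes)
               for u in nodes}
        for scores in (auth, hub):
            norm = sum(s * s for s in scores.values()) ** 0.5 or 1.0
            for n in scores:
                scores[n] /= norm
    return auth, hub

# Toy graph: an on-topic cluster {a, b, c} linked to an off-topic cluster
# {x, y, z} through the single hyperlink c -> x.
graph = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "x"],
    "x": ["y", "z"], "y": ["x", "z"], "z": ["x", "y"],
}
root = {"a", "c"}
base = expand(graph, root)                      # two-level expansion
mc_base = min_cut_community(graph, base, root)  # community around the root set
auth, hub = hits(graph, mc_base)                # HITS on the MC-base set
```

In this toy case the two-level expansion pulls in the off-topic pages `x`, `y`, and `z`, while the minimum cut severs the single bridging link and leaves only `a`, `b`, and `c` for the HITS iteration. With a larger `edge_cap` the cheapest cut can become the trivial one that keeps every page, so the capacity choice matters in practice.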