Search-ranking cheating is widespread on the web. To counter it, we propose a ranking algorithm that is based on link analysis and has built-in anti-cheating capability. Starting from an initial blacklist that contains a small set of identified cheating pages, we use the link relationships between pages to introduce two concepts, cheating tendency and association, which together measure how likely a page is to be cheating. On this basis we construct a penalty factor and use it to correct each page's PageRank value, so that pages are evaluated on both cheating tendency and authority and are re-ranked by the corrected values. The algorithm presents users with pages of relatively high authority and low, or even no, cheating tendency, which improves their search efficiency. We tested the anti-cheating performance of the algorithm on a data set of 3,537,379 pages and 8,456,740 links. The results indicate that its anti-cheating performance is considerably better than that of both PageRank and TrustRank.
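The abstract does not give the formulas for the cheating tendency or the penalty factor, so the following is only a minimal sketch of the general idea (a blacklist-driven penalty that corrects PageRank), not the authors' method. Every definition in it is an assumption made for illustration: the function names, the `decay` parameter, the rule that a page inherits a damped average of the tendency of the pages it links to, and the penalty `1 - tendency`. The association concept is not modelled at all.

```python
def all_pages(links):
    """Collect every page that appears as a link source or target."""
    return set(links) | {q for targets in links.values() for q in targets}


def pagerank(links, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a dict {page: [outlinked pages]}."""
    pages = all_pages(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without outlinks is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for p, targets in links.items():
            for q in targets:
                new_rank[q] += damping * rank[p] / len(targets)
        rank = new_rank
    return rank


def cheating_tendency(links, blacklist, decay=0.5, iterations=20):
    """Hypothetical tendency score in [0, 1], propagated backwards along links.

    Assumption: blacklisted pages score 1.0, and any other page scores `decay`
    times the average tendency of the pages it links to.
    """
    pages = all_pages(links)
    tendency = {p: 1.0 if p in blacklist else 0.0 for p in pages}
    for _ in range(iterations):
        updated = {}
        for p in pages:
            targets = links.get(p, [])
            if p in blacklist:
                updated[p] = 1.0
            elif targets:
                updated[p] = decay * sum(tendency[q] for q in targets) / len(targets)
            else:
                updated[p] = 0.0
        tendency = updated
    return tendency


def penalized_rank(links, blacklist):
    """Correct PageRank with the assumed penalty factor (1 - tendency)."""
    rank = pagerank(links)
    tendency = cheating_tendency(links, blacklist)
    return {p: rank[p] * (1.0 - tendency[p]) for p in rank}


# Toy graph: "spam" is blacklisted and "farm" links to it.
links = {"a": ["b"], "b": ["a"], "c": ["a"], "farm": ["spam"], "spam": ["farm"]}
blacklist = {"spam"}
rank = pagerank(links)
scores = penalized_rank(links, blacklist)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy graph the blacklisted page drops to a score of zero, the page that links to it loses half of its PageRank, and pages with no path to the blacklist keep their original values. A real implementation would substitute the paper's own definitions of cheating tendency, association, and the penalty factor.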