To identify spammers effectively, a Weibo spammer classifier was designed on the basis of previous research, using six feature attributes: the follower-to-following ratio, the average number of posts, the number of mutual follows, the comprehensive quality evaluation, the number of favorites, and the Sunshine Credit score. The spammer identification algorithm was implemented with a Bayesian model and a genetic optimization algorithm. Its performance was verified on real Sina Weibo data. The experimental results show that the proposed Bayesian spammer identification algorithm maintains spammer identification accuracy without sacrificing the recognition rate for non-spammers, and that the proposed threshold optimization algorithm significantly improves spammer identification accuracy.
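As a rough illustration of how a Bayesian classifier over six account features and a set of tunable thresholds can fit together, the sketch below may help. It is not the authors' implementation. The feature names, the rule of binarizing each feature against a threshold, the Laplace smoothing, and the toy data are all assumptions made for this example. The `fitness` function only shows the kind of score a genetic algorithm could maximize when searching over thresholds; the search itself is omitted.

```python
import math

# Hypothetical names for the six features listed in the abstract.
FEATURES = [
    "follower_following_ratio",
    "avg_posts",
    "mutual_follows",
    "quality_score",
    "favorites",
    "sunshine_credit",
]


def binarize(user, thresholds):
    """Map each raw feature to 1 if it reaches its threshold, else 0 (assumed rule)."""
    return [1 if user[name] >= thresholds[name] else 0 for name in FEATURES]


def train(users, labels, thresholds):
    """Estimate class priors and per-feature Bernoulli likelihoods (Laplace smoothed)."""
    model = {}
    for cls in (0, 1):  # 0 = non-spammer, 1 = spammer
        rows = [binarize(u, thresholds) for u, y in zip(users, labels) if y == cls]
        n = len(rows)
        prior = (n + 1) / (len(users) + 2)
        p_one = [(sum(r[j] for r in rows) + 1) / (n + 2) for j in range(len(FEATURES))]
        model[cls] = (prior, p_one)
    return model


def predict(model, user, thresholds):
    """Return the class with the larger naive Bayes log-posterior score."""
    bits = binarize(user, thresholds)
    scores = {}
    for cls, (prior, p_one) in model.items():
        score = math.log(prior)
        for bit, p in zip(bits, p_one):
            score += math.log(p if bit else 1 - p)
        scores[cls] = score
    return max(scores, key=scores.get)


def fitness(users, labels, thresholds):
    """Training accuracy for one threshold set; a GA could use this as its objective."""
    model = train(users, labels, thresholds)
    hits = sum(predict(model, u, thresholds) == y for u, y in zip(users, labels))
    return hits / len(users)


# Made-up toy accounts, for illustration only (not data from the paper).
toy_users = [{name: 1.0 for name in FEATURES} for _ in range(3)] + [
    {name: 10.0 for name in FEATURES} for _ in range(3)
]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_thresholds = {name: 5.0 for name in FEATURES}
toy_model = train(toy_users, toy_labels, toy_thresholds)
```

In this sketch each candidate solution of the genetic algorithm would be one `thresholds` dictionary, and candidates with higher `fitness` would be favored during selection. A real system would evaluate fitness on held-out data rather than on the training set.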