This paper takes Sina microblog as its object of study, with the aim of analyzing the characteristics of information forwarding and propagation on the platform and of predicting propagation behavior. Using a large volume of online data collected from Sina microblog, the factors that may affect users' retweeting behavior are statistically analyzed, and the features of these factors are mined and modeled. A method is proposed that predicts whether a user will retweet a given post by applying machine learning classification to three categories of features: user attributes, social relations, and microblog content. Based on the topology of the follow relationships in the microblog network, a probabilistic cascade model is used to predict the retweet paths of a given post, which provides a basis for predicting the post's range of influence. Experiments show that Sina microblog exhibits the characteristics of a complex network and that social features have an important influence on retweeting behavior; they also verify the effectiveness of the propagation prediction.
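As a rough illustration of the path-prediction step, the sketch below simulates a single cascade over a follow graph. The abstract does not specify the exact form of the cascade model, so this assumes an independent-cascade style process in which each follower who sees a post gets one chance to retweet it with some probability. The names `simulate_cascade`, `path_to`, and `retweet_prob` are hypothetical; in a pipeline like the one described, the probability could come from the retweet classifier, but that coupling is an assumption here.

```python
import random
from collections import deque


def simulate_cascade(followers, source, retweet_prob, rng=None):
    """Run one independent-cascade-style simulation (illustrative only).

    followers    -- dict mapping a user to the list of users who follow them
    source       -- the user who publishes the original post
    retweet_prob -- callable (poster, follower) -> probability in [0, 1];
                    hypothetical, e.g. the output of a retweet classifier
    Returns a dict mapping each user who forwarded the post to the user
    they forwarded it from (the source maps to None).
    """
    rng = rng or random.Random()
    parent = {source: None}
    queue = deque([source])
    while queue:
        poster = queue.popleft()
        for follower in followers.get(poster, []):
            if follower in parent:
                continue  # this user has already forwarded the post
            # Each exposed follower gets one chance to retweet from this poster.
            if rng.random() < retweet_prob(poster, follower):
                parent[follower] = poster
                queue.append(follower)
    return parent


def path_to(parent, user):
    """Recover the forwarding path from the source to `user`."""
    path = []
    while user is not None:
        path.append(user)
        user = parent[user]
    return path[::-1]


# Toy follow graph: b and c follow a; d follows both b and c.
toy_followers = {"a": ["b", "c"], "b": ["d"], "c": ["d"]}
cascade = simulate_cascade(toy_followers, "a", lambda poster, follower: 1.0)
print(path_to(cascade, "d"))  # ['a', 'b', 'd']
```

Repeating the simulation many times with realistic probabilities and counting how often each user appears in the result would give an estimate of the post's expected reach, which is the kind of influence-range estimate the abstract refers to.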