以新浪微博为研究对象,基于微博主题及用户特征,提出社交网络中的用户转发行为预测算法.首先,基于互信息理论,从已发生转发行为的用户的微博内容中提取特征,通过分析给定用户的微博内容与特征之间的相关程度,预测用户是否会对给定主题的微博发生转发行为;然后通过研究用户性别、粉丝数、关注数、微博数与用户转发行为的关系,选取合适的用户特征描述,并基于贝叶斯模型预测给定用户对微博的转发概率.最后,结合以上2种算法的预测结果,得到给定用户对某主题微博的转发行为预测.该预测算法对研究网络舆情传播及微博营销具有重要意义.
Taking Sina Weibo as the object of study, this paper proposes an algorithm for predicting user retweet behavior in social networks, based on post topics and user characteristics. First, using mutual information theory, features are extracted from the posts of users who have already retweeted; the relevance between a given user's posts and these features is then analyzed to predict whether that user will retweet a post on a given topic. Second, the relationship between retweet behavior and user gender, number of followers, number of accounts followed, and number of posts is studied in order to select a suitable description of user characteristics, and a Bayesian model is used to predict the probability that a given user retweets a post. Finally, the results of the two methods are combined to predict a given user's retweet behavior toward a post on a given topic. The prediction algorithm is of great significance for research on the spread of online public opinion and on microblog marketing.
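The two building blocks named above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the tokenization, the relevance measure, the exact form of the Bayesian model, or the rule for combining the two predictions. The sketch assumes that each user's posts are reduced to a set of terms, that features are the terms with the highest mutual information with a binary retweet label, that relevance is the fraction of selected features a user shares, and that the Bayesian model is a naive Bayes classifier over discretized user attributes with Laplace smoothing. All function and variable names are hypothetical.

```python
import math
from collections import Counter


def mutual_information(docs, labels, term):
    """Mutual information (in bits) between the presence of `term`
    in a user's term set and the binary retweet label."""
    n = len(docs)
    joint = Counter((term in doc, label) for doc, label in zip(docs, labels))
    mi = 0.0
    for (present, label), count in joint.items():
        p_xy = count / n
        p_x = sum(c for (p, _), c in joint.items() if p == present) / n
        p_y = sum(c for (_, l), c in joint.items() if l == label) / n
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi


def select_features(docs, labels, k):
    """Return the k terms with the highest mutual information.
    Ties are broken alphabetically so the result is deterministic."""
    vocabulary = set().union(*docs)
    ranked = sorted(vocabulary,
                    key=lambda t: (-mutual_information(docs, labels, t), t))
    return ranked[:k]


def relevance(user_terms, features):
    """Assumed relevance measure: share of selected features that
    also occur in the given user's term set."""
    features = set(features)
    return len(features & set(user_terms)) / len(features)


def naive_bayes_retweet_probability(rows, labels, query):
    """P(retweet | query) from a naive Bayes model over categorical
    user attributes, with Laplace (add-one) smoothing."""
    classes = (0, 1)
    n = len(rows)
    scores = {}
    for c in classes:
        members = [r for r, l in zip(rows, labels) if l == c]
        score = (len(members) + 1) / (n + len(classes))
        for name, value in query.items():
            matches = sum(1 for r in members if r[name] == value)
            n_values = len({r[name] for r in rows})
            score *= (matches + 1) / (len(members) + n_values)
        scores[c] = score
    return scores[1] / (scores[0] + scores[1])


# Toy data, invented for illustration only.
docs = [{"a", "b"}, {"a"}, {"c"}, {"c", "b"}]   # term sets per user
labels = [1, 1, 0, 0]                           # 1 = user retweeted
features = select_features(docs, labels, k=2)

users = [
    {"gender": "f", "followers": "high"},
    {"gender": "f", "followers": "high"},
    {"gender": "m", "followers": "low"},
    {"gender": "m", "followers": "low"},
]
p = naive_bayes_retweet_probability(users, labels,
                                    {"gender": "f", "followers": "high"})
```

In this toy data the terms `a` and `c` separate retweeters from non-retweeters perfectly, so they are selected, while `b` carries no information about the label. How the content-based relevance and the Bayesian probability are merged into the final prediction is left open here because the abstract does not state it.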