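Social networks have become increasingly popular in recent years, and data mining on social networks has risen with them. Link prediction, an important topic in network data mining, draws on the known network structure and related information to predict and estimate the likelihood that a link exists between two nodes that are not yet connected. In social networks, link prediction can support friend recommendation, filter out redundant information, raise user satisfaction and loyalty, and help build a healthy social networking environment. Existing link prediction algorithms concentrate on either network structure information or node attributes in order to analyze the global or local properties of a network. Taking the nature of microblog social networks into account, this paper proposes a link prediction method that fuses multiple kinds of features: node features, topological features, social features, and voting features. Based on these features, four machine learning algorithms (SVM, naive Bayes, random forest, and logistic regression) are applied to microblog social network data to train predictive models that predict potential social links. The results show that the combined features proposed in this paper perform better than traditional features, and that fusing multiple kinds of features improves the final prediction accuracy.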
In recent years, social networks have grown rapidly in popularity, and data mining based on social networks has grown with them. Link prediction (LP), an important topic in network data mining, uses the known network structure and other information to estimate the likelihood of a link between two nodes that are not yet linked. Link prediction in social networks can be used to recommend friends, filter redundant information, improve user satisfaction and loyalty, and build a healthy social networking environment. Previous research has focused on either structural information or node attributes in order to analyze the global or local properties of a network. Considering the nature of microblog social networks, this paper proposes a link prediction method that combines multiple features: node features, topological features, social features, and voting features. Based on these features, four machine learning algorithms (SVM, naive Bayes, random forest, and logistic regression) are applied to microblog social network data to train predictive models that predict potential social links. The results show that the proposed combined features perform better than traditional features, and that combining multiple features achieves the highest prediction accuracy.
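To make the idea of topological features concrete, the following minimal sketch computes three classic neighbourhood-based scores (common neighbours, Jaccard coefficient, and preferential attachment) for every unlinked node pair in a toy graph. The abstract does not list the paper's actual features, so these three scores, the toy edge list, and all function names here are illustrative assumptions rather than the authors' method. In a pipeline like the one described, such scores would presumably be joined with node, social, and voting features before being passed to a classifier.

```python
from itertools import combinations

# Toy undirected graph (hypothetical data, not taken from the paper).
EDGES = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]


def build_adjacency(edges):
    """Map each node to the set of its neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def topological_features(adj, u, v):
    """Return three classic neighbourhood-based scores for the pair (u, v).

    These are assumed stand-ins for the paper's topological features.
    """
    nu, nv = adj[u], adj[v]
    common = nu & nv
    union = nu | nv
    return {
        "common_neighbors": len(common),
        "jaccard": len(common) / len(union) if union else 0.0,
        "preferential_attachment": len(nu) * len(nv),
    }


def candidate_pairs(adj):
    """Yield every unlinked node pair, i.e. the pairs a predictor would score."""
    for u, v in combinations(sorted(adj), 2):
        if v not in adj[u]:
            yield u, v


adj = build_adjacency(EDGES)
features = {pair: topological_features(adj, *pair) for pair in candidate_pairs(adj)}
for pair, feats in features.items():
    print(pair, feats)
```

On this toy graph the pair `("a", "d")` shares two neighbours and therefore receives the highest common-neighbour and Jaccard scores, which is the kind of signal a trained classifier could use as evidence for a potential link.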