Compared with traditional social network services (SNS), Twitter-like microblog networks built on weak relationships are markedly heterogeneous. Their nodes can be divided by these characteristics into users (message subscribers) and subjects (message publishers), so one of the main goals of a recommender system on such networks is to recommend to users the subjects they are likely to want to subscribe to. Data sparsity and cold-start conditions are pervasive in these networks and are the main problems such a recommender system faces. In this paper, we propose GCCR, a recommendation algorithm based on two-phase user clustering that combines graph summarization with a content-similarity-based algorithm to recommend subjects according to user interests. Compared with previous methods, GCCR produces better recommendations on sparse datasets and in cold-start scenarios. In addition, because much of the processing of the dataset is done offline, GCCR uses these precomputed results to achieve better online recommendation efficiency than previous methods. We evaluate GCCR against baseline methods on real data from an existing social network, and we analyze the impact of each parameter on the recommendation results.