Weibo users can add text tags to their profiles, and these tags are usually regarded as important descriptions of a user's characteristics and interests. Because tag information may help build accurate user profiles, it has potential value in applications such as personalized recommendation, expert finding, and influence analysis. This paper first analyzes, on large-scale data, how Weibo users add tags and how tag content is distributed. It then applies a topic model to users' Weibo posts; the experimental results show that the more similar two users' tags are, the more similar their posts are, and vice versa. Next, it examines how follow relationships relate to post and tag content, and finds that users connected by a follow relationship have more similar posts and tags than users who are not. Based on this finding, tags and posts are used separately to predict follow relationships in real-world Weibo data. The results show that the tag-based method clearly outperforms the post-based method, demonstrating the value of user tags in describing user interests.
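The idea of predicting connections from tags can be illustrated with a minimal sketch. The abstract does not state which similarity measure or prediction procedure the study uses, so the Jaccard overlap below, the function names `jaccard` and `rank_candidate_links`, and the toy user data are all assumptions chosen for illustration only. The sketch scores every pair of users by tag overlap and ranks pairs so that the most similar ones are proposed as likely connections first.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two tag collections (assumed measure, not the paper's)."""
    a, b = set(a), set(b)
    union = a | b
    # Two empty tag sets are treated as having zero similarity.
    return len(a & b) / len(union) if union else 0.0


def rank_candidate_links(user_tags):
    """Score every user pair by tag overlap and sort from most to least similar."""
    pairs = combinations(sorted(user_tags), 2)
    scored = [(jaccard(user_tags[u], user_tags[v]), u, v) for u, v in pairs]
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(scored, key=lambda t: (-t[0], t[1], t[2]))


# Hypothetical toy data: user name -> set of profile tags.
user_tags = {
    "alice": {"travel", "photography", "food"},
    "bob": {"photography", "travel", "music"},
    "carol": {"finance", "stocks"},
}

ranking = rank_candidate_links(user_tags)
for score, u, v in ranking:
    print(f"{u} - {v}: {score:.2f}")
```

A post-based variant could replace the tag sets with topic distributions inferred from each user's posts and swap Jaccard overlap for a distribution similarity; comparing the two rankings against known follow relationships would mirror the comparison the abstract describes.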