The content a user posts on a microblog reflects that user's interests, and microblog behaviors such as forwarding, commenting, replying, and comments from other users provide strong guidance for discovering those interests. To make effective use of these behaviors, we propose a user interest modeling method for microblog content based on supervised LDA (latent Dirichlet allocation). First, by analyzing how four factors (forwarding, commenting, replying, and comments from others) affect the topics of a user's microblog interests, we define four constraint relations. Then, based on the user's microblog content, we incorporate the four constraint relations into the LDA model to build a supervised LDA topic generation model for microblogs. Finally, we obtain the user's microblog topic distribution and, from it, the user interest model. Experimental results show that, compared with the standard LDA model, the proposed method achieves considerably higher accuracy, and that the four kinds of introduced information play an important guiding role in discovering microblog users' interests.
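The abstract does not state how the four constraint relations enter the model, so the following is only a minimal sketch of one plausible reading, not the authors' formulation. It assumes that each behavioral link (forward, comment, reply, comment by others) lets the topic counts of the linked post act as extra pseudo-counts in a collapsed Gibbs sampler for LDA. The names `guided_lda`, `user_interest`, and `RELATION_WEIGHTS`, as well as the weight values, are hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical weights for the four behavior types named in the abstract.
# The paper's actual constraint definitions are not given here.
RELATION_WEIGHTS = {
    "forward": 0.5,
    "comment": 0.3,
    "reply": 0.3,
    "comment_by_others": 0.2,
}


def guided_lda(docs, links, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA with link-based guidance (an assumption).

    docs:  list of token lists, one per microblog post.
    links: list of (i, j, relation) tuples connecting post i and post j.
    Returns one topic distribution (list of floats) per post.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    # Symmetric neighbor lists: each link guides both endpoints.
    neighbors = defaultdict(list)
    for i, j, rel in links:
        weight = RELATION_WEIGHTS[rel]
        neighbors[i].append((j, weight))
        neighbors[j].append((i, weight))

    n_dk = [[0] * num_topics for _ in docs]               # post-topic counts
    n_kw = [defaultdict(int) for _ in range(num_topics)]  # topic-word counts
    n_k = [0] * num_topics                                # tokens per topic

    # Random initial topic assignment for every token.
    z = []
    for d, doc in enumerate(docs):
        z.append([])
        for w in doc:
            k = rng.randrange(num_topics)
            z[d].append(k)
            n_dk[d][k] += 1
            n_kw[k][w] += 1
            n_k[k] += 1

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for pos, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = z[d][pos]
                n_dk[d][k] -= 1
                n_kw[k][w] -= 1
                n_k[k] -= 1

                # Standard LDA conditional, plus weighted topic counts
                # borrowed from behaviorally linked posts.
                probs = []
                for t in range(num_topics):
                    guide = sum(wt * n_dk[j][t] for j, wt in neighbors[d])
                    doc_part = n_dk[d][t] + guide + alpha
                    word_part = (n_kw[t][w] + beta) / (n_k[t] + vocab_size * beta)
                    probs.append(doc_part * word_part)

                k = rng.choices(range(num_topics), weights=probs)[0]
                z[d][pos] = k
                n_dk[d][k] += 1
                n_kw[k][w] += 1
                n_k[k] += 1

    # Smoothed per-post topic distributions.
    theta = []
    for d in range(len(docs)):
        total = sum(n_dk[d]) + num_topics * alpha
        theta.append([(n_dk[d][k] + alpha) / total for k in range(num_topics)])
    return theta


def user_interest(theta, user_doc_ids):
    """Average the topic distributions of one user's posts into an interest profile."""
    num_topics = len(theta[0])
    return [
        sum(theta[d][k] for d in user_doc_ids) / len(user_doc_ids)
        for k in range(num_topics)
    ]


if __name__ == "__main__":
    posts = [
        ["football", "goal", "match"],
        ["goal", "match", "team"],
        ["stock", "market", "bank"],
        ["bank", "market", "fund"],
    ]
    behavior_links = [(0, 1, "forward"), (2, 3, "comment")]
    theta = guided_lda(posts, behavior_links, num_topics=2, iters=50)
    print(user_interest(theta, [0, 1]))
```

Setting every weight in `RELATION_WEIGHTS` to zero reduces the sampler to plain LDA, which gives a baseline of the kind the abstract compares against. Averaging per-post distributions is one simple way to form a user profile; the paper may aggregate differently.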