This paper proposes a method for mining hot topics in user reviews that combines the Latent Dirichlet Allocation (LDA) model with natural language processing techniques. To validate the method, 22,157 restaurant reviews are taken as a sample; the model parameters are estimated by Gibbs sampling, yielding the hot topics and their associated hot words. The nine topics obtained reflect the hot topics in restaurant reviews well and are broadly consistent with what diners care about in practice, which indicates that the model is effective at identifying hot topics.
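The parameter estimation step mentioned above can be illustrated with a minimal sketch of collapsed Gibbs sampling for LDA. This is an illustration under stated assumptions, not the implementation used in the paper: the function names `lda_gibbs` and `top_words`, the hyperparameter values `alpha` and `beta`, the iteration count, and the toy corpus are all hypothetical, and the preprocessing that turns reviews into word ids (for example, word segmentation and stop-word removal) is assumed to have been done beforehand.

```python
import random


def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA.

    docs is a list of documents, each a list of integer word ids in
    [0, vocab_size). alpha and beta are symmetric Dirichlet priors
    (illustrative defaults, not values taken from the paper).
    Returns (phi, doc_topic): the topic-word distributions and the
    per-document topic counts.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                 # n_dk
    topic_word = [[0] * vocab_size for _ in range(num_topics)]   # n_kw
    topic_total = [0] * num_topics                               # n_k
    assign = []                                                  # topic of each token

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)

    topics = range(num_topics)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1

                # Unnormalised full conditional p(z = t | all other assignments).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in topics
                ]

                # Sample a new topic and restore the counts.
                k = rng.choices(topics, weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Point estimate of the topic-word distributions from the final counts.
    phi = [
        [(topic_word[k][w] + beta) / (topic_total[k] + vocab_size * beta)
         for w in range(vocab_size)]
        for k in topics
    ]
    return phi, doc_topic


def top_words(phi, vocab, n=3):
    """Return the n most probable words of each topic (the 'hot words')."""
    return [
        [vocab[w] for w in sorted(range(len(vocab)), key=lambda w: -row[w])[:n]]
        for row in phi
    ]


if __name__ == "__main__":
    # Hypothetical toy corpus; real input would be the tokenised reviews.
    vocab = ["taste", "price", "service", "waiter", "cheap", "spicy"]
    docs = [[0, 5, 0, 5], [1, 4, 1], [2, 3, 2, 3], [0, 1, 5]]
    phi, _ = lda_gibbs(docs, num_topics=2, vocab_size=len(vocab), iters=50)
    for k, words in enumerate(top_words(phi, vocab)):
        print(k, words)
```

On a corpus of the size described above, the number of topics would be set to nine and the highest-probability words of each topic read off as its hot words. A vectorised or library-based sampler would be preferable in practice, since this pure-Python loop is written for clarity rather than speed.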