Existing methods have difficulty capturing the overall sentiment orientation of review text. To address this problem, a multi-document sentiment summarization method based on the Latent Dirichlet Allocation (LDA) model is proposed. The method first performs sentiment analysis on the given sentences and extracts those that carry subjective evaluations. It then represents the extracted sentences with an LDA model and computes a weight for each sentence from the importance of its words and the features of the sentence. Finally, the sentiment summary is extracted according to these sentence weights. Experimental results show that the method effectively identifies key sentiment sentences and achieves good precision, recall, and F-measure.
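The pipeline described in the abstract (subjective-sentence extraction, topic-based word importance, sentence weighting, summary extraction) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The sentiment lexicon, the topic prior, the topic-word probabilities, and the weighting formula below are all hypothetical stand-ins: in practice the topic distributions would come from a trained LDA model, and the abstract does not specify how word importance and sentence features are combined.

```python
from typing import Dict, List, Tuple

# Hypothetical toy sentiment lexicon (an assumption, not taken from the paper).
SENTIMENT_WORDS = {"great", "excellent", "poor", "terrible", "love", "disappointing"}

# Hypothetical stand-in for the output of a trained LDA model:
# TOPIC_PRIOR[k] = P(topic k), TOPIC_WORD[k][w] = P(word w | topic k).
TOPIC_PRIOR: List[float] = [0.6, 0.4]
TOPIC_WORD: List[Dict[str, float]] = [
    {"battery": 0.30, "life": 0.20, "great": 0.25, "poor": 0.05},
    {"screen": 0.35, "terrible": 0.20, "excellent": 0.15},
]


def tokenize(sentence: str) -> List[str]:
    """Lowercase, split on whitespace, and strip basic punctuation."""
    tokens = (w.strip(".,!?;:").lower() for w in sentence.split())
    return [t for t in tokens if t]


def is_subjective(tokens: List[str]) -> bool:
    """Treat a sentence as subjective if it contains any lexicon word."""
    return any(t in SENTIMENT_WORDS for t in tokens)


def word_importance(word: str) -> float:
    """Marginal probability of the word under the topic mixture."""
    return sum(p_k * dist.get(word, 0.0) for p_k, dist in zip(TOPIC_PRIOR, TOPIC_WORD))


def sentence_weight(tokens: List[str]) -> float:
    """Assumed formula: mean word importance, boosted by the share of sentiment words."""
    if not tokens:
        return 0.0
    topical = sum(word_importance(t) for t in tokens) / len(tokens)
    sentiment_ratio = sum(t in SENTIMENT_WORDS for t in tokens) / len(tokens)
    return topical * (1.0 + sentiment_ratio)


def summarize(sentences: List[str], k: int) -> List[str]:
    """Keep subjective sentences, rank them by weight, and return the top k."""
    tokenized: List[Tuple[str, List[str]]] = [(s, tokenize(s)) for s in sentences]
    subjective = [(s, toks) for s, toks in tokenized if is_subjective(toks)]
    ranked = sorted(subjective, key=lambda pair: sentence_weight(pair[1]), reverse=True)
    return [s for s, _ in ranked[:k]]


if __name__ == "__main__":
    reviews = [
        "The battery life is great.",
        "The phone ships in a box.",
        "The screen is terrible.",
    ]
    for line in summarize(reviews, 2):
        print(line)
```

In this sketch the objective sentence about shipping is discarded in the first step because it contains no lexicon word, and the remaining sentences are ordered by their assumed weights. Replacing the hand-written tables with distributions estimated by an actual LDA implementation would leave the ranking logic unchanged.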