To quickly obtain the topic content and sentiment information in online texts, this paper introduces the concept of sentiment summarization and proposes a method for extracting sentiment summaries based on the conditional random field (CRF) model. First, sentence length, cue words, and sentiment words are extracted as basic features. The latent Dirichlet allocation (LDA) topic model is then applied to analyze the latent topic information of the text and to extract topic features. Both kinds of features are fed into the CRF model to obtain the sentiment summary of the text. Experimental results show that the method captures the topic information of the text in fine detail while also accounting for the sentiment associated with those topics; the extracted summaries are of satisfactory quality and can meet users' practical needs.
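The feature-extraction step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The lexicons `CUE_WORDS` and `SENTIMENT_WORDS`, the table `TOPIC_WORD_PROB`, the length thresholds, and all function names are hypothetical. In particular, the hand-written topic-word table only stands in for the output of a trained LDA model, and CRF training itself is not shown. The sketch produces one feature dictionary per sentence, which is the kind of sequence input a CRF toolkit typically expects.

```python
from typing import Dict, List

# Hypothetical toy lexicons; the paper's actual resources are not given.
CUE_WORDS = {"overall", "in summary", "in short"}
SENTIMENT_WORDS = {"good", "great", "bad", "poor", "excellent", "terrible"}

# Hypothetical topic-word probabilities, standing in for what a trained
# LDA model would provide (assumption: two topics, a few words each).
TOPIC_WORD_PROB: Dict[str, Dict[str, float]] = {
    "screen": {"screen": 0.5, "display": 0.3, "color": 0.2},
    "battery": {"battery": 0.6, "charge": 0.3, "power": 0.1},
}

PUNCT = ".,!?;:"


def tokenize(sentence: str) -> List[str]:
    """Lowercase whitespace tokenizer that strips surrounding punctuation."""
    tokens = (w.strip(PUNCT).lower() for w in sentence.split())
    return [t for t in tokens if t]


def dominant_topic(tokens: List[str]) -> str:
    """Pick the topic whose words carry the most probability mass in the sentence."""
    scores = {
        topic: sum(probs.get(t, 0.0) for t in tokens)
        for topic, probs in TOPIC_WORD_PROB.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "none"


def sentence_features(sentence: str) -> Dict[str, object]:
    """Combine the basic features and the topic feature for one sentence."""
    tokens = tokenize(sentence)
    padded = " " + " ".join(tokens) + " "  # padding allows whole-phrase matching
    n = len(tokens)
    return {
        "length": "short" if n < 5 else "medium" if n < 15 else "long",
        "has_cue": any(f" {cue} " in padded for cue in CUE_WORDS),
        "n_sentiment": sum(t in SENTIMENT_WORDS for t in tokens),
        "topic": dominant_topic(tokens),
    }


def document_features(sentences: List[str]) -> List[Dict[str, object]]:
    """One feature dict per sentence, in document order."""
    return [sentence_features(s) for s in sentences]
```

For example, `document_features(["Overall, the screen is great.", "I bought it last week."])` marks the first sentence as containing a cue word, one sentiment word, and the hypothetical "screen" topic, while the second sentence receives no cue, no sentiment word, and the topic "none". A CRF would then label each sentence as belonging to the summary or not, using these features together with those of neighboring sentences.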