Classifying product features and their associated opinion words in Chinese product reviews is an important research topic in opinion mining. This paper adapts traditional English dependency grammar and summarizes five dependency relations commonly found in Chinese product reviews. A maximum entropy model is trained to predict opinion-relevant product feature relations, using a set of composite feature templates built on these dependency relations. Experiments show that, when the composite templates are applied to feature-opinion pair extraction, recall and F-score reach 78.68% and 75.36%, respectively, clearly outperforming Hu's adjacency-based method and Popescu's pattern-based method.
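To make the idea of dependency-based composite feature templates concrete, the following is a minimal sketch and not the authors' implementation. It treats each candidate feature-opinion pair as a binary classification instance and fits a two-class maximum entropy model, which is equivalent to logistic regression, by stochastic gradient ascent. The relation labels (`SBV`, `ATT`, and so on), the part-of-speech tags, the distance threshold, and the toy training data are all assumptions made for illustration; the five relations and the exact templates used in the paper are not listed in this abstract.

```python
import math
from collections import defaultdict

# Toy candidate pairs (hypothetical, for illustration only):
# (dependency relation, POS of feature word, POS of opinion word, token distance)
# paired with a label: 1 = genuine feature-opinion pair, 0 = not a pair.
TRAIN = [
    (("SBV", "n", "a", 1), 1),
    (("SBV", "n", "a", 2), 1),
    (("ATT", "n", "a", 1), 1),
    (("VOB", "n", "v", 3), 0),
    (("COO", "n", "n", 4), 0),
    (("ADV", "v", "d", 1), 0),
]

def composite_features(rel, feat_pos, op_pos, dist):
    """Expand one candidate pair into atomic and composite indicator features."""
    near = "near" if dist <= 2 else "far"  # assumed distance threshold
    return [
        "bias",
        f"rel={rel}",
        f"pos={feat_pos}|{op_pos}",
        f"dist={near}",
        f"rel+pos={rel}|{feat_pos}|{op_pos}",  # composite: relation with POS pair
        f"rel+dist={rel}|{near}",              # composite: relation with distance
    ]

def predict(weights, feats):
    """Probability that the candidate is a genuine feature-opinion pair."""
    score = sum(weights[f] for f in feats)
    return 1.0 / (1.0 + math.exp(-score))

def train(data, epochs=200, lr=0.5, l2=0.01):
    """Fit indicator-feature weights by stochastic gradient ascent with L2 decay."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for x, y in data:
            feats = composite_features(*x)
            error = y - predict(weights, feats)
            for f in feats:
                weights[f] += lr * (error - l2 * weights[f])
    return weights

weights = train(TRAIN)
candidate = composite_features("SBV", "n", "a", 1)
print(f"P(pair | SBV, n, a, near) = {predict(weights, candidate):.3f}")
```

The composite features let the model weight a relation differently depending on the part-of-speech pair or distance it co-occurs with, which a set of purely atomic features cannot express.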