Classifying the sentiment orientation of online comments makes it possible to track, in a timely way, public attitudes and states of mind toward current events and hot topics, and so provides a basis for decision-making in related fields. To address sentiment orientation classification of online movie reviews, this paper proposes an algorithm based on Internet-word extension and attribute reduction. The algorithm removes spam comments using a relevance measure, extends the feature attributes to account for the characteristics of Internet language, and then performs feature selection in two steps, first by word frequency and then by information gain, to construct the feature attributes used for classification. Experimental results show that the algorithm improves classification accuracy and the other evaluation metrics.
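The two-step feature selection can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the thresholds or the exact information gain formulation, so the function names, the `min_freq` and `top_k` parameters, the presence/absence form of information gain, and the toy reviews below are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(docs, labels, word):
    """Drop in label entropy from knowing whether `word` occurs in a document."""
    present = [y for d, y in zip(docs, labels) if word in d]
    absent = [y for d, y in zip(docs, labels) if word not in d]
    total = len(labels)
    conditional = sum(
        len(part) / total * entropy(part) for part in (present, absent) if part
    )
    return entropy(labels) - conditional


def select_features(docs, labels, min_freq=2, top_k=2):
    """Step 1: drop words seen fewer than `min_freq` times (assumed threshold).
    Step 2: keep the `top_k` remaining words with the highest information gain."""
    word_freq = Counter(w for d in docs for w in d)
    candidates = [w for w, n in word_freq.items() if n >= min_freq]
    # Sort by descending information gain; ties are broken alphabetically.
    ranked = sorted(candidates, key=lambda w: (-information_gain(docs, labels, w), w))
    return ranked[:top_k]


# Hypothetical tokenized movie reviews with sentiment labels.
docs = [
    ["great", "movie", "loved"],
    ["great", "acting", "movie"],
    ["boring", "movie", "awful"],
    ["boring", "plot", "movie"],
]
labels = ["pos", "pos", "neg", "neg"]

selected = select_features(docs, labels)
print(selected)
```

In this toy data the frequency step discards words that occur only once, and the information gain step then ranks "movie" last because it appears in every review and so carries no information about the label.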