With the rapid development of the Internet and information technology, the volume of user comments online keeps growing. Using computers to analyze the sentiment orientation of large-scale online text holds great promise for applications such as government public opinion analysis and intelligent feedback on enterprise product reviews. This paper studies how the choice of text features affects the accuracy of sentiment orientation classification. The features examined in the experiments include sentiment words, adjectives, adverbs, modal particles, and punctuation marks. The experimental results show that using sentiment words, adjectives, and adverbs as features yields good classification performance, and that adding modal particle and punctuation features on top of them effectively improves classification accuracy. The findings can be applied to public opinion analysis, spam blog filtering, product review and recommendation, film and television review analysis, and similar areas.
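As an illustration only, the feature selection idea can be sketched as counting the tokens that fall into each chosen feature group. The word lists, the group names, and the functions `tokenize` and `extract_features` below are hypothetical and written in English for readability; they are not the lexicons, tagger, or classifier used in the experiments, which would more plausibly rely on a sentiment dictionary and part-of-speech tagging.

```python
import re
from collections import Counter

# Hypothetical toy lexicons (assumption): a real system would use a sentiment
# dictionary and a part-of-speech tagger instead of fixed word lists.
FEATURE_GROUPS = {
    "sentiment": {"good", "great", "bad", "terrible"},
    "adjective": {"fast", "slow", "cheap", "expensive"},
    "adverb": {"very", "really", "too"},
    "modal": {"wow", "oh", "ah"},
    "punctuation": {"!", "?"},
}

# Base feature set versus the set extended with modal particles and punctuation.
BASE_GROUPS = ("sentiment", "adjective", "adverb")
EXTENDED_GROUPS = BASE_GROUPS + ("modal", "punctuation")


def tokenize(text):
    """Lowercase the text and return words plus '!' and '?' as separate tokens."""
    return re.findall(r"[a-z']+|[!?]", text.lower())


def extract_features(text, groups):
    """Count the tokens that belong to any of the selected feature groups."""
    vocabulary = set().union(*(FEATURE_GROUPS[g] for g in groups))
    return Counter(tok for tok in tokenize(text) if tok in vocabulary)


review = "Wow, really good phone but too expensive!"
base = extract_features(review, BASE_GROUPS)
extended = extract_features(review, EXTENDED_GROUPS)
print(sorted(base))
print(sorted(extended))
```

The extended feature set keeps every base feature and adds the interjection and the exclamation mark, which is the kind of extra signal the abstract reports as helpful. Either count vector could then be passed to any standard text classifier.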