Sentiment analysis of text has been an active research topic in natural language processing in recent years. Microblog posts are short texts: their wording is concise, and they are rich in opinions, tendencies, and attitudes. Identifying the sentiment orientation of microblog posts is therefore of practical importance. This paper proposes a sentiment analysis method based on SVM and CRF that uses several kinds of text features, including words, parts of speech, sentiment words, negation words, degree adverbs, and special symbols. Different feature combinations are compared over multiple sets of experiments to find the one that gives the best result. The experiments show that the SVM model reaches an accuracy of 88.72% with the combination of part of speech, sentiment words, and negation words, while the CRF model reaches 90.44% with the combination of sentiment words, negation words, degree adverbs, and special symbols.
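The feature groups named above can be illustrated with a minimal sketch. It is not the paper's implementation: the lexicons, the POS tags, and the helper names `token_features` and `bag_features` are hypothetical, and the input is assumed to be already segmented and POS-tagged. The sketch only shows how a chosen feature combination might be turned into per-token features (the shape a CRF toolkit typically expects) or into sparse counts (the shape an SVM typically expects).

```python
from collections import Counter

# Toy lexicons: hypothetical placeholders, not the resources used in the paper.
SENTIMENT = {"喜欢": "pos", "开心": "pos", "讨厌": "neg", "失望": "neg"}
NEGATIONS = {"不", "没", "没有"}
DEGREE_ADVERBS = {"很", "非常", "太"}
SPECIAL_SYMBOLS = {"!", "?", ":)", ":("}

FEATURE_GROUPS = ("word", "pos", "sentiment", "negation", "degree", "symbol")


def token_features(tokens, groups=FEATURE_GROUPS):
    """Map (word, pos_tag) pairs to one feature dict per token (CRF-style input)."""
    rows = []
    for word, pos in tokens:
        feats = {}
        if "word" in groups:
            feats["word"] = word
        if "pos" in groups:
            feats["pos"] = pos
        if "sentiment" in groups:
            feats["sentiment"] = SENTIMENT.get(word, "none")
        if "negation" in groups:
            feats["negation"] = word in NEGATIONS
        if "degree" in groups:
            feats["degree"] = word in DEGREE_ADVERBS
        if "symbol" in groups:
            feats["symbol"] = word in SPECIAL_SYMBOLS
        rows.append(feats)
    return rows


def bag_features(tokens, groups=FEATURE_GROUPS):
    """Collapse per-token features into sparse counts (SVM-style input)."""
    counts = Counter()
    for feats in token_features(tokens, groups):
        for name, value in feats.items():
            if value is False or value == "none":
                continue  # skip inactive features
            counts[f"{name}={value}"] += 1
    return dict(counts)


# Pre-segmented, POS-tagged example post (words and tags are illustrative).
post = [("我", "r"), ("不", "d"), ("喜欢", "v"), ("!", "w")]

# The two feature combinations reported as best for each model.
crf_rows = token_features(post, groups=("sentiment", "negation", "degree", "symbol"))
svm_vector = bag_features(post, groups=("pos", "sentiment", "negation"))
```

Passing a different `groups` tuple is how one combination would be swapped for another in an experiment of this kind. The resulting dictionaries could then be vectorized and handed to any SVM or CRF library; that step is omitted here because the source does not say which toolkits were used.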