Emoticons (smileys) have become a new form of online language and are widely used in microblogs. They convey users' moods and thoughts to some extent and therefore affect the results of microblog sentiment analysis. This paper proposes building a sentiment lexicon for emoticons from microblog statistics: the sentiment orientation of each emoticon is determined by analyzing the sentiment orientation of the text that co-occurs with it across a large number of microblog posts. The resulting emoticon sentiment lexicon is intended to support sentiment analysis of microblogs and of other user-generated Web content that contains emoticons. The emoticon lexicon is then fed back to the corresponding microblog texts to re-measure the orientation values of the sentiment words they contain, improving the existing sentiment lexicon and yielding more accurate sentiment analysis results. Experiments demonstrate the effectiveness of the method and analyze how the relevant threshold settings influence the results.
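The abstract gives no formulas, so the following is only a minimal sketch of the two steps it describes, under stated assumptions. An emoticon's orientation is taken to be the mean lexicon-based score of the texts it co-occurs with, and a word's orientation is then re-measured by blending its prior value with the mean orientation of the emoticons it co-occurs with. The seed lexicon, the function names, and the `min_count` and `weight` parameters are all hypothetical illustrations, not the paper's actual definitions or thresholds.

```python
from collections import defaultdict

# Hypothetical seed sentiment lexicon: word -> polarity in [-1, 1].
SEED_LEXICON = {"happy": 1.0, "great": 0.8, "sad": -1.0, "awful": -0.9}


def text_score(tokens, lexicon):
    """Mean polarity of the tokens found in the lexicon, or None if none match."""
    hits = [lexicon[t] for t in tokens if t in lexicon]
    return sum(hits) / len(hits) if hits else None


def build_emoticon_lexicon(posts, lexicon, min_count=2):
    """Assign each emoticon the mean score of the texts it co-occurs with.

    posts: iterable of (tokens, emoticons) pairs.
    min_count: assumed frequency threshold; rarer emoticons are dropped.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for tokens, emoticons in posts:
        score = text_score(tokens, lexicon)
        if score is None:
            continue  # no sentiment words in this post, so no evidence
        for emoticon in set(emoticons):
            totals[emoticon] += score
            counts[emoticon] += 1
    return {e: totals[e] / counts[e] for e in counts if counts[e] >= min_count}


def refine_word_lexicon(posts, lexicon, emoticon_lexicon, weight=0.5):
    """Blend each word's prior polarity with the emoticon evidence around it.

    weight: assumed mixing factor between prior and emoticon-derived values.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for tokens, emoticons in posts:
        scores = [emoticon_lexicon[e] for e in set(emoticons) if e in emoticon_lexicon]
        if not scores:
            continue
        post_score = sum(scores) / len(scores)
        for token in set(tokens):
            if token in lexicon:
                totals[token] += post_score
                counts[token] += 1
    refined = dict(lexicon)
    for token in counts:
        evidence = totals[token] / counts[token]
        refined[token] = (1 - weight) * lexicon[token] + weight * evidence
    return refined


if __name__ == "__main__":
    sample_posts = [
        (["so", "happy", "today"], [":)"]),
        (["great", "day"], [":)"]),
        (["sad", "news"], [":("]),
        (["awful", "weather"], [":("]),
        (["happy"], [":D"]),  # seen only once, so dropped by min_count
    ]
    emoticons = build_emoticon_lexicon(sample_posts, SEED_LEXICON)
    print(emoticons)
    print(refine_word_lexicon(sample_posts, SEED_LEXICON, emoticons))
```

In this sketch, `min_count` plays the role of one kind of threshold whose setting would change which emoticons enter the lexicon. The paper's own thresholds may be defined differently.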