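Content posted on Sina Weibo has no fixed topic or domain, uses Internet slang, and contains many emoticons and colloquial expressions. These properties differ markedly from traditional text, so traditional sentiment lexicons are ill-suited to short texts in social networks. To address this, a method is proposed for building a cross-media, multi-emotion (joy, anger, sorrow, and happiness) sentiment lexicon based on the special emotional symbols used in social networks. The method combines images with short-text content and computes the mutual information between emoticons and text words to select the entries of a social-network emotion lexicon. The lexicon is compared with existing sentiment lexicons; the results show that, in the social network setting, it reaches a coverage of 84.23% and a classification accuracy of 73.71% over the four emotions, clearly outperforming existing lexicons.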
Content posted on Sina microblog usually has no fixed theme, and its wording is rarely formal written language: it contains emoticons and colloquial expressions. Hence traditional sentiment lexicons are not suitable for sentiment analysis of texts from online social networks. To solve this problem, a method for generating a four-class sentiment lexicon (happy, love, angry, sad) is proposed, using the emoticons found in social networks. Compared with other sentiment lexicons, the proposed lexicon performs better at analyzing the sentiment of short texts in social networks: its coverage is 84.23% and its accuracy is 73.71%.
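The selection step based on mutual information between emoticons and text words can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each post has already been tokenized and reduced to a single emotion label derived from its emoticons, and it uses pointwise mutual information over post-level co-occurrence as the score. The function name `build_lexicon`, the thresholds `min_count` and `min_pmi`, and the toy posts are all hypothetical.

```python
import math
from collections import Counter


def build_lexicon(posts, min_count=2, min_pmi=1.0):
    """Assign each word to the emotion label it co-occurs with most strongly.

    posts: list of (tokens, label) pairs, where label is the emotion class
    implied by the emoticons in the post (an assumption of this sketch).
    Returns {word: (label, pmi)} for words passing both thresholds.
    """
    n = len(posts)
    word_count = Counter()   # number of posts containing the word
    label_count = Counter()  # number of posts carrying the label
    pair_count = Counter()   # number of posts containing both

    for tokens, label in posts:
        label_count[label] += 1
        for word in set(tokens):  # count each word once per post
            word_count[word] += 1
            pair_count[(word, label)] += 1

    lexicon = {}
    for word, cw in word_count.items():
        if cw < min_count:  # skip words too rare to score reliably
            continue
        best_label, best_pmi = None, float("-inf")
        for label, cl in label_count.items():
            cwl = pair_count[(word, label)]
            if cwl == 0:
                continue
            # PMI(w, e) = log2( p(w, e) / (p(w) * p(e)) )
            pmi = math.log2(cwl * n / (cw * cl))
            if pmi > best_pmi:
                best_label, best_pmi = label, pmi
        if best_label is not None and best_pmi >= min_pmi:
            lexicon[word] = (best_label, best_pmi)
    return lexicon


# Toy data, invented for illustration only.
posts = [
    (["great", "day"], "happy"),
    (["great", "news"], "happy"),
    (["awful", "traffic"], "angry"),
    (["awful", "queue"], "angry"),
    (["lonely", "day"], "sad"),
    (["lonely", "night"], "sad"),
]
lexicon = build_lexicon(posts)
```

In this toy run, "day" appears equally often with two labels, so its score stays below `min_pmi` and it is left out, while "great", "awful", and "lonely" are kept. A real lexicon would need far more posts and thresholds tuned on held-out data.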