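Sentiment analysis has become a hot topic in natural language processing. Research on automated, semi-supervised sentiment analysis of text has broad theoretical and practical value. Sentiment orientation analysis based on a sentiment lexicon is an important means of text sentiment analysis. However, the sentiment orientation of a Chinese word varies across domains, and polysemy is pronounced. The sentiment words of different domains are also specialized and domain-specific. To address these problems, this paper proposes a semi-supervised sentiment polarity classification algorithm based on word vector similarity (Sentiment orientation from word vector, SO-WV) and, building on this algorithm, designs a cross-domain method for constructing Chinese sentiment lexicons. Experiments show that the proposed lexicon construction method can effectively determine the sentiment orientation of sentiment words. The algorithm ports well when building sentiment lexicons for different domains, and the resulting lexicons retain specialized, domain-specific characteristics.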
Sentiment analysis has become a hot research topic in natural language processing. Automated, semi-supervised sentiment analysis of text has considerable theoretical and practical value. Determining sentiment orientation with a sentiment lexicon is an important approach to text sentiment analysis, and constructing such a lexicon effectively is a basic task. However, Chinese words are highly ambiguous: the same word can carry different sentiment orientations in different domains. Moreover, the sentiment words of each domain tend to be specialized and domain-specific. To address these problems, we propose a semi-supervised sentiment orientation classification algorithm based on word vector similarity (SO-WV). Experiments show that the algorithm classifies the sentiment orientation of words effectively. It ports well across domains while still capturing the specialized, domain-specific character of each domain's sentiment words.
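The abstract does not state the SO-WV scoring formula, so the following is only a minimal sketch of one plausible reading: a candidate word is scored by its mean cosine similarity to a small set of positive seed words minus its mean cosine similarity to a set of negative seed words. The function names, the seed lists, the zero threshold, and the two-dimensional toy vectors are all hypothetical and chosen for illustration; in practice the vectors would come from a model trained on a corpus from the target domain.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def so_wv_score(word, vectors, pos_seeds, neg_seeds):
    """Assumed scoring rule: mean similarity to positive seeds minus
    mean similarity to negative seeds. Not taken from the paper."""
    w = vectors[word]
    pos = [cosine(w, vectors[s]) for s in pos_seeds if s in vectors]
    neg = [cosine(w, vectors[s]) for s in neg_seeds if s in vectors]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative seed with a vector")
    return sum(pos) / len(pos) - sum(neg) / len(neg)


def classify(word, vectors, pos_seeds, neg_seeds, threshold=0.0):
    """Map the score to a polarity label; the threshold is a hypothetical parameter."""
    score = so_wv_score(word, vectors, pos_seeds, neg_seeds)
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


# Hand-made toy vectors standing in for domain-specific word embeddings.
TOY_VECTORS = {
    "good": [1.0, 0.1],
    "excellent": [0.9, 0.2],
    "bad": [-1.0, 0.1],
    "terrible": [-0.9, 0.2],
    "durable": [0.8, 0.3],
    "laggy": [-0.7, 0.4],
}
POS_SEEDS = ["good", "excellent"]
NEG_SEEDS = ["bad", "terrible"]

if __name__ == "__main__":
    for candidate in ("durable", "laggy"):
        print(candidate, classify(candidate, TOY_VECTORS, POS_SEEDS, NEG_SEEDS))
```

Under this reading, only the seed lists need labels, which is what makes the procedure semi-supervised. Retraining the vectors on a corpus from another domain is what would let the same word receive a different orientation there.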