Abstract: Sentiment classification is an active research topic in opinion mining, and classifying sentiment in microblog text has high practical value. Because traditional feature selection methods fail to capture semantics, a neural network language model is used, and a deep feature representation method is proposed that assigns weights to word vectors based on a probability model; this method is used to build a semantic vector for each text. The deep and shallow features of a text are then fused into a feature vector that carries semantic information, compensating for the semantic deficiency of traditional feature selection. A hierarchical SVM classification model is used to classify multiple sentiment categories. Experimental results show that the hierarchical sentiment classification method based on feature fusion effectively improves the accuracy of microblog sentiment classification.
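The feature construction summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the probability model behind the weights, the shallow features, or the SVM hierarchy, so the per-word weights are simply passed in as a hypothetical dictionary, term counts stand in for the shallow features, and the names `text_vector`, `shallow_features`, and `fuse` are illustrative only.

```python
from collections import Counter


def text_vector(tokens, word_vectors, weights):
    """Deep feature: weighted average of the word vectors of the tokens.

    `weights` is a hypothetical per-word weight table; the abstract derives
    such weights from a probability model that is not specified here.
    """
    dim = len(next(iter(word_vectors.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for tok in tokens:
        if tok not in word_vectors:
            continue  # skip out-of-vocabulary tokens
        w = weights.get(tok, 1.0)  # assumed default weight of 1.0
        for i, x in enumerate(word_vectors[tok]):
            total[i] += w * x
        weight_sum += w
    if weight_sum == 0.0:
        return total  # no known tokens: return the zero vector
    return [x / weight_sum for x in total]


def shallow_features(tokens, vocabulary):
    """Shallow feature (assumed here): raw term counts over a fixed vocabulary."""
    counts = Counter(tokens)
    return [float(counts[term]) for term in vocabulary]


def fuse(shallow, deep):
    """Feature fusion by simple concatenation (an assumption of this sketch)."""
    return list(shallow) + list(deep)


# Toy data, invented purely for illustration.
word_vectors = {"good": [1.0, 0.0], "bad": [-1.0, 0.0], "movie": [0.0, 1.0]}
weights = {"good": 3.0, "bad": 3.0, "movie": 1.0}
vocabulary = ["good", "bad"]

tokens = ["good", "movie"]
deep = text_vector(tokens, word_vectors, weights)
fused = fuse(shallow_features(tokens, vocabulary), deep)
print(fused)
```

In a full pipeline, the fused vectors would be passed to SVM classifiers arranged in a hierarchy; that stage is omitted here because the abstract does not describe how the hierarchy is organized.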