Identifying the sentiment orientation of opinion sentences in large volumes of Chinese microblog text reveals the personal stance that users hold toward an event or a product. To find features and models better suited to this task, this paper first analyzes the characteristics of microblog opinion sentences and uses three types of features (sentence-pattern features, intra-sentence features, and latent features) with an SVM model to separate subjective microblogs from objective ones. Then, taking the subjective sentences as the corpus, it represents each microblog from four perspectives (sentiment features, part-of-speech features, sentence-pattern features, and inter-sentence features) and applies an SVM model to classify the orientation of each opinion sentence as positive, negative, or mixed. In the COAE2015 Task2 evaluation ("microblog opinion sentence identification"), the method performed well: on the micro-averaged measures it achieved a precision of 74.01%, a recall of 71.61%, and an F-value of 72.79%, ranking second overall. These evaluation results confirm that the proposed method is effective and feasible.
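The three reported figures can be cross-checked with a short calculation. The sketch below assumes that the F-value is the balanced F-measure (the harmonic mean of the two rates) and that the first reported figure is precision; neither is stated explicitly in the abstract, and the function name `f_measure` is introduced here only for illustration.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both rates are zero
    return 2 * precision * recall / (precision + recall)


# Micro-averaged figures reported in the abstract, in percent.
reported_precision = 74.01  # assumed to be precision, not accuracy
reported_recall = 71.61

f_value = f_measure(reported_precision, reported_recall)
print(f"{f_value:.2f}")  # prints 72.79
```

Under these assumptions the result matches the reported F-value of 72.79%, which suggests the three numbers are internally consistent.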