To handle the large volume of text on Chinese microblogs, this paper proposes identifying opinion sentences using a domain opinion-word dictionary and a support vector machine (SVM). A domain opinion-word dictionary is constructed, and five features that characterize opinion sentences in Chinese microblogs are compiled statistically; features 1, 2, 3, and 4 are selected for opinion sentence identification. Three SVM-based identification algorithms, each using a different feature combination, are compared with the dictionary-based algorithm. The comparison shows that the SVM-based approach identifies microblog opinion sentences more effectively, with a precision of 68.75%, a recall of 48.71%, and an F-measure of 57.02%.
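As a quick consistency check on the reported scores, the sketch below recomputes the F-measure from the other two figures. It assumes that the first reported figure is precision in the information-retrieval sense and that the F-measure is the balanced harmonic mean (F1); neither is stated explicitly in the abstract, and the function name `f_measure` is illustrative only.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract, expressed as fractions.
# Assumption: the first figure is precision, not overall accuracy.
precision = 0.6875
recall = 0.4871

f1 = f_measure(precision, recall)
print(f"F-measure: {f1 * 100:.2f}%")  # prints "F-measure: 57.02%"
```

Under these assumptions the recomputed value matches the reported 57.02%, which suggests the three figures are mutually consistent as precision, recall, and F1.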