More than 40,000 sentences were extracted from the People's Daily corpus to serve as training and test sets. Features such as the subject and predicate of each sentence were selected and quantified using HowNet, and a support vector machine (SVM) was then trained on them to obtain a model for identifying coordinate complex sentences. In an open test, the model achieved an accuracy of about 84%, which indicates that the approach is feasible.
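The pipeline summarized above (quantified features, SVM training, evaluation on held-out data) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration. The four-dimensional feature vectors are synthetic stand-ins for HowNet-derived codes, not the People's Daily data. The function names (`make_toy_data`, `train_linear_svm`, `predict`) are hypothetical. The classifier is a plain linear SVM trained by stochastic subgradient descent on the regularized hinge loss, which may differ from the SVM implementation and kernel actually used in the experiments.

```python
import random


def make_toy_data(n, rng):
    """Synthetic stand-in for quantified sentence features (NOT the paper's data).

    Each sample holds 4 hypothetical numeric codes (e.g. subject and predicate
    of two clauses). Label +1 = coordinate complex sentence, -1 = other.
    """
    samples, labels = [], []
    for _ in range(n):
        y = rng.choice([1, -1])
        samples.append([rng.gauss(y * 1.0, 0.5) for _ in range(4)])
        labels.append(y)
    return samples, labels


def decision(w, b, x):
    """Signed score w.x + b of the linear decision function."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(samples, labels, lam=0.01, eta=0.01, epochs=20, seed=0):
    """Linear SVM via stochastic subgradient descent on the L2-regularized hinge loss."""
    rng = random.Random(seed)
    w = [0.0] * len(samples[0])
    b = 0.0
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = samples[i], labels[i]
            # The hinge loss is active when the margin is below 1.
            violated = y * decision(w, b, x) < 1
            # Gradient step on the L2 penalty (shrinks the weights).
            w = [wj - eta * lam * wj for wj in w]
            if violated:
                # Subgradient step on the hinge term.
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
                b += eta * y
    return w, b


def predict(w, b, x):
    """Return +1 (coordinate complex sentence) or -1 (other)."""
    return 1 if decision(w, b, x) >= 0 else -1


# Train on one synthetic sample set and evaluate on a disjoint one,
# mirroring the training / open-test split described in the abstract.
rng = random.Random(42)
train_x, train_y = make_toy_data(400, rng)
test_x, test_y = make_toy_data(100, rng)
w, b = train_linear_svm(train_x, train_y)
correct = sum(predict(w, b, x) == y for x, y in zip(test_x, test_y))
accuracy = correct / len(test_y)
print(f"held-out accuracy on toy data: {accuracy:.2f}")
```

The accuracy printed here reflects only the synthetic clusters and says nothing about the 84% figure reported for the real corpus. In practice the feature vectors would come from parsing each sentence and mapping its subject and predicate to HowNet-based values.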