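Using support vector machines (SVM) as the machine learning method and the Chinese Penn Treebank as the basis, a partial semantic role labeling experiment was carried out on Chinese text. Five main semantic roles were selected: subject, object, indirect object, time, and location. The first 1,652 sentences of Chinese PropBank 5.0 served as the training and test sets, and eight attributes, including path, phrase type, predicate, head word, and head-word part of speech, were chosen as classification features. With a two-stage classification method, the overall precision and recall of semantic role labeling on the test set were 89.73% and 91.26%, respectively. The results indicate that the method is effective for shallow semantic analysis of Chinese.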
This paper presents a semantic role labeling experiment on Chinese text using support vector machines (SVM). The experiment uses the first 1,652 sentences of Chinese PropBank 5.0 as its training and test data. The role set covers subject, object, indirect object, time, and location. A two-phase classification method is applied with eight features, including path, phrase type, predicate, head word, and head-word part of speech. Even with this small-scale training set, the method reaches a precision of 89.73% and a recall of 91.26% for semantic role labeling on the test set. The results indicate that the proposed approach is effective for shallow semantic parsing of Chinese.
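Of the eight features, the path feature is the least self-explanatory, so a minimal sketch may help. The abstract does not define it; the code below assumes the definition commonly used in the semantic role labeling literature, namely the chain of phrase labels from a candidate constituent up to the lowest common ancestor and then down to the predicate. The `Node` class, the `path_feature` function, and the toy parse tree are hypothetical illustrations, not the authors' implementation or data.

```python
class Node:
    """A parse-tree node with a label and a link to its parent (hypothetical helper)."""

    def __init__(self, label, *children):
        self.label = label
        self.parent = None
        self.children = list(children)
        for child in self.children:
            child.parent = self


def ancestors(node):
    """Return the chain [node, parent, ..., root]."""
    chain = []
    while node is not None:
        chain.append(node)
        node = node.parent
    return chain


def path_feature(constituent, predicate, up="↑", down="↓"):
    """Build the label path from a constituent to the predicate.

    Assumed definition: climb from the constituent to the lowest common
    ancestor, then descend to the predicate node.
    """
    up_chain = ancestors(constituent)
    down_chain = ancestors(predicate)
    # Index predicate-side ancestors by identity so equal labels do not collide.
    position = {id(n): i for i, n in enumerate(down_chain)}
    for i, node in enumerate(up_chain):
        if id(node) in position:
            j = position[id(node)]
            break
    else:
        raise ValueError("nodes do not belong to the same tree")
    rising = [n.label for n in up_chain[: i + 1]]
    falling = [n.label for n in reversed(down_chain[:j])]
    return up.join(rising) + "".join(down + label for label in falling)


# Toy tree (illustrative only): (IP (NP (NN)) (VP (VV) (NP (NN))))
pred = Node("VV")
subj = Node("NP", Node("NN"))
obj = Node("NP", Node("NN"))
root = Node("IP", subj, Node("VP", pred, obj))

subj_path = path_feature(subj, pred)  # "NP↑IP↓VP↓VV"
obj_path = path_feature(obj, pred)    # "NP↑VP↓VV"
```

In this toy tree the subject and object candidates are both NP constituents, yet their paths to the verb differ, which is why a path string of this kind can be a useful categorical input to a classifier such as an SVM.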