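Among research methods for semantic role labeling (SRL), one of the most frequently used is based on feature engineering: the task is converted into a classification problem and solved with machine learning. Almost all supervised SRL work uses corpora annotated under the University of Pennsylvania Proposition Bank (PropBank) scheme. Recently, Peking University developed a new annotated corpus, the Peking University Chinese NetBank. This paper aims to test how this class of methods performs on the new corpus and to verify whether the previously used features depend on the annotated corpus. The experiments reveal some shortcomings in earlier methods; in particular, certain individual features play a more critical role on the Peking University NetBank.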
Among the approaches to semantic role labeling (SRL), one important method adopted by many researchers converts the task into a classification problem by selecting features and then applying different kinds of classifiers. Almost all research based on this kind of supervised learning has been done on the same corpus, the Penn Proposition Bank (PropBank). Here we test the same method on a new corpus, the Peking University Chinese NetBank, with the goal of determining whether the widely used features depend strongly on the corpus. The experiments show that the method and the features perform well on the new corpus. Compared with PropBank, some features play a more crucial role in classification on the new corpus.