Because semantic role labeling (SRL) is essential for deep natural language processing, an SRL method based on conditional random fields (CRFs) is proposed. Building on shallow syntactic parsing, the method treats phrases or named entities as the basic labeling units and applies a CRF model to label the semantic roles associated with each predicate in a sentence. The keys to the method are parameter estimation and feature selection. In practice, the L-BFGS algorithm is used to learn the model parameters, and three categories of features form the model's feature set: features based on syntactic constituents, features based on the predicate, and features describing the constituent-predicate relation. Experiments on the dataset provided for the CoNLL-2005 SRL shared task show that the CRF-based method outperforms a method based on the maximum entropy model, achieving 80.43% precision and 63.55% recall on semantic role labeling.
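To make the three feature categories concrete, the following is a minimal sketch of chunk-level feature extraction. The abstract does not list the individual features, so the `Chunk` layout, the function names, and every feature name here (`type`, `head`, `pred`, `position`, `distance`, `path`) are illustrative assumptions rather than the feature set actually used.

```python
from typing import Dict, List, NamedTuple


class Chunk(NamedTuple):
    """One labeling unit: a phrase or named entity (hypothetical layout)."""
    text: str         # surface words of the chunk
    phrase_type: str  # shallow-parse label, e.g. "NP", "VP", "PP"
    head: str         # head word of the chunk


def chunk_features(chunks: List[Chunk], i: int, pred_idx: int) -> Dict[str, str]:
    """Build assumed features for chunk i relative to the predicate chunk."""
    chunk, pred = chunks[i], chunks[pred_idx]
    if i < pred_idx:
        position = "before"
    elif i > pred_idx:
        position = "after"
    else:
        position = "self"
    lo, hi = min(i, pred_idx), max(i, pred_idx)
    return {
        # Category 1: features based on the syntactic constituent
        "type": chunk.phrase_type,
        "head": chunk.head.lower(),
        # Category 2: features based on the predicate
        "pred": pred.head.lower(),
        # Category 3: constituent-predicate relation features
        "position": position,
        "distance": str(abs(i - pred_idx)),
        # Chunk types spanning the two units, in sentence order
        "path": "-".join(c.phrase_type for c in chunks[lo:hi + 1]),
    }


def sentence_features(chunks: List[Chunk], pred_idx: int) -> List[Dict[str, str]]:
    """One feature dictionary per chunk, for a single predicate."""
    return [chunk_features(chunks, i, pred_idx) for i in range(len(chunks))]
```

A sequence of such dictionaries, one per chunk and paired with role labels, is the kind of input a linear-chain CRF trainer with L-BFGS optimization would typically consume; a sentence with several predicates would be processed once per predicate.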