This paper proposes a maximum entropy approach to semantic role labeling (SRL). Building on shallow syntactic parsing, the method treats phrases and named entities as the basic labeling units and applies a maximum entropy model to label the semantic roles associated with the predicates in a sentence. The key issues are parameter estimation and feature selection. In our implementation, the model parameters are learned with the Improved Iterative Scaling (IIS) algorithm, and the feature set comprises four categories: constituent-based features, predicate-based features, constituent-predicate relation features, and semantic features. The method is applied to event-mention sentences in information extraction, where the event elements of two event types, "management succession" and "meeting", are labeled with semantic roles. It achieves F-measures of 76.3% and 72.2% on the respective test sets.
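To make the classification step concrete, the following is a minimal sketch of how a conditional maximum entropy model scores candidate role labels for one labeling unit. It is not the authors' implementation: the label names, feature strings, and weight values are hypothetical and hand-set for illustration, whereas in the method described above the weights would be estimated by IIS from annotated data.

```python
import math


def maxent_probs(active_features, weights, labels):
    """Return p(label | context) for a conditional maximum entropy model.

    p(y | x) = exp(sum_i w_i * f_i(x, y)) / Z(x), where each binary feature
    f_i fires when a context feature co-occurs with a given label.
    """
    scores = {
        label: sum(weights.get((feat, label), 0.0) for feat in active_features)
        for label in labels
    }
    # Subtract the maximum score before exponentiating for numerical stability.
    max_score = max(scores.values())
    exp_scores = {label: math.exp(s - max_score) for label, s in scores.items()}
    z = sum(exp_scores.values())  # normalization constant Z(x)
    return {label: v / z for label, v in exp_scores.items()}


# Hypothetical role labels for a "management succession" event.
LABELS = ["PERSON_IN", "POSITION", "NONE"]

# Hypothetical weights keyed by (context feature, label); assumed, not learned.
WEIGHTS = {
    ("unit_type=person_name", "PERSON_IN"): 1.5,
    ("position=before_predicate", "PERSON_IN"): 0.8,
    ("unit_type=noun_phrase", "POSITION"): 1.2,
    ("position=after_predicate", "POSITION"): 0.9,
}

# Context features of one labeling unit (a phrase or named entity).
features = [
    "unit_type=person_name",
    "position=before_predicate",
    "predicate=appoint",
]

probs = maxent_probs(features, WEIGHTS, LABELS)
best_label = max(probs, key=probs.get)
print(best_label, round(probs[best_label], 3))
```

Each labeling unit is assigned the role with the highest conditional probability. The four feature categories named in the abstract would all enter the model the same way, as additional context features paired with labels.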