The predicate is the most important constituent of a sentence, and whether it is recognized correctly has a large effect on semantic analysis. The performance of predicate identification depends directly on a large number of features, so how these features are organized is particularly important. This paper selects 7 basic features and more than 30 new features, together with their combinations, and uses a maximum entropy classifier. Starting from the basic features and adding the features that prove beneficial, the F1 score of predicate identification rises by about 5 percentage points (from 84.7% to 89.8%), and the F1 score of predicate classification (sense recognition) rises by about 2 percentage points (from 80.3% to 82.1%). The results show that the new features and their combinations substantially improve system performance.
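The idea of starting from a baseline feature set and keeping only the additions that raise F1 can be illustrated with a minimal sketch. The abstract does not say how beneficial features were chosen, so the greedy forward procedure below is an assumption, not the paper's method. The feature names and the `toy_evaluate` scores are hypothetical placeholders; in practice `evaluate` would train a maximum entropy classifier on the given feature set and return its F1 on held-out data.

```python
from typing import Callable, FrozenSet, Iterable, List, Tuple


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 is the harmonic mean of precision and recall (0.0 if undefined)."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def greedy_add_features(
    baseline: Iterable[str],
    candidates: Iterable[str],
    evaluate: Callable[[FrozenSet[str]], float],
) -> Tuple[List[str], float]:
    """Greedy forward selection (assumed procedure, not from the paper).

    In each round, try every remaining candidate on top of the current
    feature set and keep the one with the largest score, stopping when
    no candidate improves the score.
    """
    selected = list(baseline)
    remaining = list(candidates)
    best = evaluate(frozenset(selected))
    improved = True
    while improved and remaining:
        improved = False
        scored = [(evaluate(frozenset(selected + [c])), c) for c in remaining]
        top_score, top_feature = max(scored)
        if top_score > best:
            selected.append(top_feature)
            remaining.remove(top_feature)
            best = top_score
            improved = True
    return selected, best


def toy_evaluate(features: FrozenSet[str]) -> float:
    """Hypothetical stand-in for 'train the classifier and measure F1'."""
    gains = {"pos_bigram": 0.03, "head_word+pos": 0.02, "noisy_feature": -0.01}
    return 0.80 + sum(gains.get(f, 0.0) for f in features)


if __name__ == "__main__":
    chosen, score = greedy_add_features(
        ["word", "pos"],
        ["pos_bigram", "head_word+pos", "noisy_feature"],
        toy_evaluate,
    )
    print(chosen, round(score, 2))
```

With the toy scores, the two helpful candidates are kept and the harmful one is left out. A combined feature such as `head_word+pos` is treated as a single candidate, which is one simple way to let combinations compete with individual features.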