The ba-sentence ("把"字句) is an important special sentence pattern in Modern Chinese. This paper proposes a knowledge-base-driven, rule-based method for automatic semantic role labeling of ba-sentences. First, we collect ba-sentences from a People's Daily (《人民日报》) corpus annotated with semantic roles, building an example bank with broad coverage of the pattern. Next, we manually annotate the syntactic and semantic composition of each example, including the valence type of the predicate verb, the predicate structure type of the ba-sentence, and its sentence-pattern (semantic structure) type. Based on these annotations, we analyze how the semantic patterns of ba-sentences are formed and derive a set of semantic role labeling rules. Finally, we evaluate the rules on test data, where they reach an overall labeling accuracy of 98.61%, indicating that the proposed rules are effective for semantic role labeling of ba-sentences.