This paper addresses frame disambiguation in the automatic identification of Chinese frames: given a target word in a sentence, the task is to assign it, based on its context, an appropriate frame from the existing Chinese FrameNet (CFN) frame inventory. The task is treated as a classification problem and modeled with a maximum entropy classifier. Features are drawn from words, parts of speech, base chunks, and the dependency syntax tree, combined with a variable-size sliding window and a bag-of-words (BOW) strategy. The training and test corpus consists of 2,077 annotated example sentences covering 88 lexical units from the current CFN. In 3-fold cross-validation experiments, the best configuration achieves an accuracy of 69.28%.
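The pipeline described above (windowed context features plus BOW, a maximum entropy classifier, and 3-fold cross-validation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the toy sentences, the frame labels `FrameA` and `FrameB`, and the training hyperparameters are hypothetical and are not taken from the paper or from CFN. Only word and part-of-speech features are shown; base-chunk and dependency features would be added to the feature dictionary in the same way. The maximum entropy model is implemented here as multinomial logistic regression trained by plain stochastic gradient ascent, which is one common way to fit such a model and not necessarily the optimizer the authors used.

```python
import math
from collections import defaultdict

def extract_features(tokens, pos_tags, target_idx, window=2):
    """Position-sensitive window features plus bag-of-words features."""
    feats = {}
    for offset in range(-window, window + 1):
        i = target_idx + offset
        if 0 <= i < len(tokens):
            feats[f"w[{offset}]={tokens[i]}"] = 1.0    # word at relative position
            feats[f"p[{offset}]={pos_tags[i]}"] = 1.0  # POS tag at relative position
    for i, tok in enumerate(tokens):
        if i != target_idx:
            feats[f"bow={tok}"] = 1.0                  # position-free context word
    return feats

def predict_proba(weights, feats, labels):
    """Softmax over per-label linear scores (the maximum entropy form)."""
    scores = [sum(weights.get((lab, f), 0.0) * v for f, v in feats.items())
              for lab in labels]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def predict(weights, feats, labels):
    probs = predict_proba(weights, feats, labels)
    return labels[probs.index(max(probs))]

def train_maxent(data, labels, epochs=50, lr=0.5, l2=0.01):
    """Stochastic gradient ascent on the L2-regularized log-likelihood."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for feats, gold in data:
            probs = predict_proba(weights, feats, labels)
            for lab, p in zip(labels, probs):
                grad = (1.0 if lab == gold else 0.0) - p  # observed minus expected
                for f, v in feats.items():
                    weights[(lab, f)] += lr * (grad * v - l2 * weights[(lab, f)])
    return weights

def k_fold_indices(n, k=3):
    """Assign example i to fold i mod k."""
    return [list(range(i, n, k)) for i in range(k)]

def cross_validate(data, labels, k=3):
    """Return overall accuracy across k held-out folds."""
    correct = 0
    for held_out in k_fold_indices(len(data), k):
        held = set(held_out)
        train = [ex for i, ex in enumerate(data) if i not in held]
        weights = train_maxent(train, labels)
        correct += sum(predict(weights, data[i][0], labels) == data[i][1]
                       for i in held_out)
    return correct / len(data)

# Invented toy examples for the target word "看"; labels are placeholders.
LABELS = ["FrameA", "FrameB"]
RAW = [
    (["我", "看", "电影"], ["r", "v", "n"], 1, "FrameA"),
    (["他", "看", "电视"], ["r", "v", "n"], 1, "FrameA"),
    (["我", "看", "他", "不错"], ["r", "v", "r", "a"], 1, "FrameB"),
    (["你", "看", "这样", "可以"], ["r", "v", "r", "v"], 1, "FrameB"),
    (["她", "看", "书"], ["r", "v", "n"], 1, "FrameA"),
    (["我", "看", "行"], ["r", "v", "v"], 1, "FrameB"),
]
DATA = [(extract_features(t, p, i), g) for t, p, i, g in RAW]

if __name__ == "__main__":
    print("3-fold accuracy on toy data:", cross_validate(DATA, LABELS, k=3))
```

The `window` parameter corresponds to the variable-size sliding window mentioned in the abstract: varying it changes which position-sensitive features are generated, while the `bow=` features ignore position entirely. On six invented sentences the printed accuracy carries no meaning; the sketch only shows how the pieces fit together.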