Ambiguity Resolution in Automatic Chinese Frame Identification

Journal: 中文信息学报 (Journal of Chinese Information Processing)
Date: 0
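Pages: 38-44
Language: Chinese
Classification: TP391 [Automation and Computer Technology: Computer Application Technology; Automation and Computer Technology: Computer Science and Technology]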
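Author affiliations: [1] Computer Center, Shanxi University, Taiyuan, Shanxi 030006; [2] School of Mathematical Sciences, Shanxi University, Taiyuan, Shanxi 030006; [3] Taiyuan Institute of Technology, Taiyuan, Shanxi 030008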
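Funding: National Natural Science Foundation of China (60873128); Shanxi Province University High-Tech Industrialization Project (20090003).
Acknowledgements: The experiments used the FC2000 word segmentation software from Shanxi University, the automatic Chinese base-chunk tagger provided by Professor Zhou Qiang of Tsinghua University, the Stanford parser (v1.6), the Language Technology Platform (LTP) from the Research Center for Information Retrieval at Harbin Institute of Technology, and the Mate dependency parser. The authors thank the providers of these tools.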
Related project: Research on automatic semantic role labeling for Chinese frames

Keywords: Chinese FrameNet (汉语框架语义知识库), frame semantics, frame disambiguation, maximum entropy model

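Abstract (translated from Chinese):

This paper studies ambiguity resolution in automatic Chinese frame identification: given a target word in a sentence, the task is to automatically assign it an appropriate frame from the existing frame inventory, based on the word's context. The task is treated as a classification problem and modeled with maximum entropy. The features are drawn from words, parts of speech, base chunks, and the dependency parse tree, combined with a sliding-window technique and a bag-of-words (BOW) strategy. The training and test data consist of 2,077 example sentences covering 88 lexical units from the current Chinese FrameNet. In 3-fold cross-validation experiments, the best result reaches an accuracy of 69.28%.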
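The windowing and bag-of-words strategies mentioned in the abstract can be illustrated with a minimal sketch. The record does not give the paper's feature templates, so the function name `extract_features`, the feature-string format, the `<PAD>` symbol, and the toy POS tags below are all assumptions made for illustration; chunk and dependency features are left out.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag); the tag set here is hypothetical


def extract_features(tokens: List[Token], target: int, window: int = 2) -> Dict[str, float]:
    """Build window and bag-of-words features for the target word at index `target`."""
    feats: Dict[str, float] = {}
    for offset in range(-window, window + 1):
        i = target + offset
        inside = 0 <= i < len(tokens)
        word, pos = tokens[i] if inside else ("<PAD>", "<PAD>")
        # Position-sensitive features: the word and POS tag at each window offset.
        feats[f"w[{offset}]={word}"] = 1.0
        feats[f"p[{offset}]={pos}"] = 1.0
        # Bag-of-words features: context words in the window, position ignored.
        if inside and offset != 0:
            feats[f"bow={word}"] = 1.0
    return feats


# Toy sentence "他 打 电话" with the target word "打" at index 1.
sentence = [("他", "r"), ("打", "v"), ("电话", "n")]
features = extract_features(sentence, target=1, window=1)
```

Changing `window` gives the different window sizes that such experiments typically compare; the resulting feature dictionary is the kind of input a maximum entropy classifier would consume.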

English abstract:

This paper addresses the issue of disambiguation during automatic identification of Chinese frames, i.e., assigning an appropriate frame from the current CFN (Chinese FrameNet) to a given target word within a sentence. This frame disambiguation task is treated as a frame classification problem based on context, using a maximum entropy model. The selected features include BOW (bag-of-words), the current word, part of speech, base chunk information, and labels in the dependency syntax tree, combined with sliding windows of varying size. The training and testing sets contain 2,077 annotated sentences covering 88 lexical units from the current Chinese FrameNet. The best result achieves an accuracy of 69.28% in the 3-fold cross-validation experiments.
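To make the setup in the abstract concrete, here is a minimal sketch of a maximum entropy (multinomial log-linear) classifier evaluated with k-fold cross-validation. It is not the authors' implementation: the record does not name their toolkit or training procedure, so the plain stochastic gradient training, the hyperparameters, the interleaved fold assignment, and the labels `FrameA` and `FrameB` are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

Example = Tuple[Dict[str, float], str]  # (feature dict, frame label)
Weights = Dict[Tuple[str, str], float]  # keyed by (feature, label)


def predict_proba(weights: Weights, feats: Dict[str, float], labels: List[str]) -> Dict[str, float]:
    """Return P(label | feats) under the log-linear model (softmax over label scores)."""
    scores = {y: sum(weights[(f, y)] * v for f, v in feats.items()) for y in labels}
    top = max(scores.values())  # subtract the max score for numerical stability
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    z = sum(exps.values())
    return {y: e / z for y, e in exps.items()}


def train_maxent(data: List[Example], labels: List[str],
                 epochs: int = 50, lr: float = 0.5, l2: float = 0.01) -> Weights:
    """Fit weights by stochastic gradient ascent on the L2-regularized log-likelihood."""
    weights: Weights = defaultdict(float)
    for _ in range(epochs):
        for feats, gold in data:
            probs = predict_proba(weights, feats, labels)
            for y in labels:
                # Gradient term: observed indicator minus the model's probability.
                err = (1.0 if y == gold else 0.0) - probs[y]
                for f, v in feats.items():
                    weights[(f, y)] += lr * (err * v - l2 * weights[(f, y)])
    return weights


def cross_validate(data: List[Example], k: int = 3) -> float:
    """Return overall accuracy over k folds; example i goes to fold i % k."""
    labels = sorted({y for _, y in data})
    correct = 0
    for fold in range(k):
        train = [ex for i, ex in enumerate(data) if i % k != fold]
        test = [ex for i, ex in enumerate(data) if i % k == fold]
        weights = train_maxent(train, labels)
        for feats, gold in test:
            probs = predict_proba(weights, feats, labels)
            if max(probs, key=probs.get) == gold:
                correct += 1
    return correct / len(data)


# Hypothetical toy data: two senses of one target word, told apart by a context word.
toy: List[Example] = [
    ({"bow=电话": 1.0}, "FrameA"),
    ({"bow=篮球": 1.0}, "FrameB"),
] * 3
accuracy = cross_validate(toy, k=3)
```

The toy data is trivially separable, so it only demonstrates the mechanics of training and fold-wise evaluation; it says nothing about the accuracy reported in the abstract.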
