Automatic Identification of Semantic Core Words of Frame Elements Based on Multi-Word Chunks

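ISSN: 1003-0077
Journal: Journal of Chinese Information Processing (中文信息学报)
Date: 0
Pages: 30-36
Classification: TP391 [Automation and Computer Technology: Computer Application Technology; Automation and Computer Technology: Computer Science and Technology]
Author affiliation: [1] School of Computer and Information Technology, Shanxi University, Taiyuan, Shanxi 030006
Funding: National Natural Science Foundation of China (60970053); National High Technology Research and Development Program of China (863 Program) (2006AA01Z142); Shanxi Provincial Laboratory Open Fund (2009011059-4); National Undergraduate Innovative Experiment Program (081010816)
Related project: Research on Key Technologies for Automatic Extraction of Chinese Frame Semantic Dependency Graphs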

Keywords: computer application, Chinese information processing, frame element, semantic core word, multi-word chunk

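Abstract (translated from Chinese):

Extracting the core dependency graph of a sentence is an effective route to understanding its semantics. Automatic CFN annotation alone yields only a frame dependency graph; converting it into a frame core dependency graph requires extracting the semantic core word of each frame element. This paper proposes a method for identifying and extracting the semantic core words of frame elements based on multi-word chunk annotation. Through comparative analysis, it gives a strategy for merging multi-word chunks with frame elements and establishes a rule set for extracting the semantic core words of frame elements on top of multi-word chunk annotation. Experiments on 6,771 frame elements show that the method and rule set extract the core words of frame elements with an average precision of 95.58% and an average coverage of 82.91%.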

English abstract:

Extracting the frame kernel dependency graph from a sentence is an effective way to understand its semantic information. Because automatic CFN annotation yields only the frame dependency graph of a sentence, the semantic core word of each frame element must be extracted before the frame kernel dependency graph can be built. This paper proposes a method to identify and extract the core words of frame elements using multi-word chunks. On the basis of a comparative analysis, we propose a strategy for integrating multi-word chunks with frame elements, together with rules for extracting the core words of frame elements from the multi-word chunk labeling. Experimental results on 6,771 frame elements show that the average precision and average coverage are 95.58% and 82.91%, respectively.
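
The abstract reports precision and coverage without defining them. The following is a minimal sketch of one common reading, not the authors' evaluation code. It assumes that coverage is the share of frame elements for which the rule set returns any core word, and that precision is the share of those returned words that match a gold annotation. The function name `precision_and_coverage` and the toy data are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def precision_and_coverage(
    predicted: Sequence[Optional[str]], gold: Sequence[str]
) -> Tuple[float, float]:
    """Return (precision, coverage) for rule-based core-word extraction.

    Assumed definitions (not taken from the paper):
      coverage  = frame elements with a predicted core word / all frame elements
      precision = predictions matching the gold core word / all predictions
    A prediction of None means no rule fired for that frame element.
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")

    # Keep only the frame elements for which some rule produced a core word.
    covered = [(p, g) for p, g in zip(predicted, gold) if p is not None]

    coverage = len(covered) / len(gold) if gold else 0.0
    correct = sum(1 for p, g in covered if p == g)
    precision = correct / len(covered) if covered else 0.0
    return precision, coverage


# Toy data: five frame elements, one of which no rule covers.
predicted = ["graph", None, "word", "chunk", "rule"]
gold = ["graph", "sentence", "word", "chunk", "method"]

precision, coverage = precision_and_coverage(predicted, gold)
print(f"precision={precision:.2%}, coverage={coverage:.2%}")
```

Under these assumed definitions, a rule set can trade coverage for precision by declining to answer on frame elements that no rule matches cleanly, which is consistent with the reported precision being higher than the reported coverage.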
