A Named Entity Disambiguation Method Based on Chinese Wikipedia

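ISSN: 1003-3513
Journal: 《数据分析与知识发现》 (Data Analysis and Knowledge Discovery)
Date: 0
Classification: TP391 [Automation and Computer Technology - Computer Application Technology; Automation and Computer Technology - Computer Science and Technology]
Author affiliation: [1] School of Computer Science, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018
Funding: National Natural Science Foundation of China (61103101); Humanities and Social Sciences Research Fund of the Ministry of Education (12YJCZH201)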

Keywords: named entity disambiguation, word sense disambiguation, Chinese Wikipedia, Chinese information processing

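Chinese abstract (translated):

This paper proposes a statistical disambiguation method for Chinese named entities. The method uses Chinese Wikipedia as world knowledge and takes the sense options listed on the Wikipedia disambiguation page of the ambiguous named entity as the candidate entity concepts. It disambiguates Chinese named entities by making full use of Wikipedia page information and link information together with the context of the named entity. Experiments on a small test set achieved an accuracy of 87.5%, indicating that the proposed method is feasible and effective.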

English abstract:

This paper proposes a statistical model for Chinese named entity disambiguation that makes use of the rich link information and the various types of page information in Chinese Wikipedia. The model combines text features and Wikipedia features by different means. The word sense options contained in the Wikipedia disambiguation page of the named entity in question are taken as the candidate named entities. In the experiment, an accuracy of 87.5% was obtained on a small test set. The experimental results show that the proposed method is feasible and effective.
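
The abstract does not give the model's features or how they are combined, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that candidates come from a disambiguation page, that each candidate is scored by the cosine similarity between the mention context and the candidate page text, and that this score is combined with the Jaccard overlap between the entities in the context and the candidate page's links. The function names, the `link_weight` parameter, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def disambiguate(context_tokens, context_entities, candidates, link_weight=0.5):
    """Pick the candidate sense that best matches the mention context.

    candidates maps a sense title to {"tokens": [...], "links": {...}},
    standing in for a candidate page's text and outgoing links.
    The linear combination below is an assumption made for illustration.
    """
    ctx = Counter(context_tokens)
    ctx_entities = set(context_entities)
    scores = {}
    for title, page in candidates.items():
        # Page information: textual similarity to the mention context.
        text_score = cosine(ctx, Counter(page["tokens"]))
        # Link information: Jaccard overlap with entities seen in the context.
        union = ctx_entities | set(page["links"])
        shared = ctx_entities & set(page["links"])
        link_score = len(shared) / len(union) if union else 0.0
        scores[title] = (1 - link_weight) * text_score + link_weight * link_score
    best = max(scores, key=scores.get)
    return best, scores


# Toy, hand-made data (not taken from Wikipedia): the mention is "苹果".
toy_candidates = {
    "苹果公司": {"tokens": ["公司", "手机", "发布", "电脑"],
                 "links": {"iPhone", "史蒂夫·乔布斯"}},
    "苹果 (水果)": {"tokens": ["水果", "果树", "种植"],
                   "links": {"蔷薇科"}},
}
best, scores = disambiguate(
    context_tokens=["苹果", "发布", "新", "手机"],  # pre-segmented context words
    context_entities=["iPhone"],                    # other entities in the context
    candidates=toy_candidates,
)
print(best, scores)
```

Chinese text has to be word-segmented before a bag-of-words comparison like this is possible; the sketch sidesteps that by taking pre-segmented tokens as input.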
