Named Entity Disambiguation Combining Entity Linking and Entity Clustering

ISSN: 1007-5321
Journal: Journal of Beijing University of Posts and Telecommunications
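Classification: TN929.53 [Electronics and Telecommunications: Communication and Information Systems; Electronics and Telecommunications: Information and Communication Engineering]
Author affiliation: [1] Center for Intelligence Science and Technology, Beijing University of Posts and Telecommunications, Beijing 100876
Funding: National Natural Science Foundation of China (61273365)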

Keywords: named entity disambiguation, entity linking, clustering

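Chinese abstract (in translation):

To resolve the ambiguity of named entities in text, this paper proposes a named entity disambiguation algorithm that combines entity linking with entity clustering; combining the two methods compensates for the limitations of using either one alone. The algorithm expands each mention to be disambiguated into its full name using the background text, uses the expanded full name to generate a candidate entity set from the English Wikipedia knowledge base, and extracts multiple features to rank the candidate set. Mentions with no corresponding entity in the knowledge base are disambiguated by clustering. Experimental results show that the algorithm achieves an F-measure of 0.746 on the KBP2011 evaluation data and 0.670 on the KBP2012 evaluation data.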

English abstract:

To eliminate the ambiguity of named entities in documents, a named entity disambiguation algorithm combining entity linking and entity clustering is proposed; combining the two methods compensates for the limitations of using either one alone. The proposed algorithm first expands the mentions using the background document, then generates candidates for the expansions from the English Wikipedia knowledge base, then extracts a variety of features to rank the candidates, and finally uses clustering to disambiguate the mentions that have no candidates in the knowledge base. Experimental results show that the proposed algorithm achieves an F-measure of 0.746 on the KBP2011 data set and 0.670 on the KBP2012 data set.
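The four stages named in the abstract (mention expansion, candidate generation, candidate ranking, clustering of unlinked mentions) can be illustrated with a toy sketch. This is not the authors' implementation: the two-entry `KB`, the capitalized-span expansion rule, the single word-overlap ranking feature, the `min_score` threshold, and the grouping of unlinked mentions by identical expanded name are all simplifying assumptions made here for illustration. The paper itself uses the English Wikipedia knowledge base and multiple features, which the abstract does not enumerate.

```python
import re

# Hypothetical toy knowledge base: entity id -> (title, description).
KB = {
    "E1": ("Michael Jordan", "basketball player chicago bulls nba"),
    "E2": ("Michael I. Jordan", "machine learning professor berkeley statistics"),
}

def tokens(text):
    """Lowercased set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))

def expand_mention(mention, document):
    """Stage 1 (assumed rule): longest capitalized span containing the mention."""
    runs = re.findall(r"[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*", document)
    containing = [r for r in runs if tokens(mention) <= tokens(r)]
    return max(containing, key=len, default=mention)

def generate_candidates(name):
    """Stage 2: KB entities whose title contains every token of the name."""
    return [eid for eid, (title, _) in KB.items() if tokens(name) <= tokens(title)]

def rank_candidates(candidates, document):
    """Stage 3 (single assumed feature): context/description word overlap."""
    context = tokens(document)
    scored = [(len(context & tokens(KB[eid][1])), eid) for eid in candidates]
    return max(scored, default=(0, None))

def disambiguate(queries, min_score=1):
    """Link each (query id, mention, document); cluster mentions left unlinked."""
    results, nil_clusters = {}, {}
    for qid, mention, document in queries:
        name = expand_mention(mention, document)
        score, best = rank_candidates(generate_candidates(name), document)
        if best is not None and score >= min_score:
            results[qid] = best
        else:
            # Stage 4 (stand-in for clustering): same expanded name, same cluster.
            key = name.lower()
            nil_clusters.setdefault(key, f"NIL{len(nil_clusters) + 1:03d}")
            results[qid] = nil_clusters[key]
    return results

queries = [
    ("q1", "Jordan", "Michael Jordan published a paper on machine learning. "
                     "Jordan is a professor at Berkeley."),
    ("q2", "Jordan", "Michael Jordan scored 40 points for the Chicago Bulls."),
    ("q3", "Smith", "Alice Smith opened a bakery."),
    ("q4", "Smith", "The bakery of Alice Smith won a prize."),
]
results = disambiguate(queries)
```

In this sketch the two "Jordan" queries are linked to different knowledge-base entries by their contexts, while the two "Smith" queries have no candidates and end up in the same unlinked cluster. That is the complementary behavior the abstract attributes to combining linking with clustering.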

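Project associated with this paper

Research on Language Grounding Technology Based on Children's Language Acquisition Mechanisms

Journal papers: 6

Journal papers from the same project

Cross-Language Pseudo-Relevance Feedback Based on Bilingual Topics

A Part-of-Speech Tagging Algorithm for English Articles

A Topic Model for Social Tag Recommendation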

Offline Urdu Nastaleeq Optical Character Recognition Based on Stacked Denoising Autoencoder

First-Feed LSTM model for video description

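Journal information

Journal of Beijing University of Posts and Telecommunications
PKU Core Journal (2011 edition)

Supervising authority: Ministry of Education
Sponsor: Beijing University of Posts and Telecommunications
Editor-in-chief: Liu Jie
Address: P.O. Box 195, No. 10 Xitucheng Road, Haidian District, Beijing
Postal code: 100876
Email: byxb@bupt.edu.cn
Phone: 010-62281995, 62282742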

International standard serial number: ISSN 1007-5321
Domestic unified serial number: CN 11-3570/TN
Postal distribution code: 2-648

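Awards:
Journal indexed in the Engineering Information (Ei) database (USA); 1999 National Outstanding University Journal of Natural Sciences and Ministry of Education Outstanding...; "Double-Effect" journal of the China Periodical Matrix

Indexed in domestic and international databases:
Chemical Abstracts, online edition (USA); Abstract and Citation Database (Netherlands); Engineering Index (USA); Cambridge Scientific Abstracts (USA); Japan Science and Technology Agency database (Japan); China Science and Technology Core Journals (China); PKU Core Journals (China), 2000, 2004, 2008, 2011, and 2014 editions

Citations: 7684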