This paper proposes a statistical method for disambiguating Chinese named entities. The method uses Chinese Wikipedia as its source of world knowledge and takes the word-sense options listed on the Wikipedia disambiguation page of the target named entity as the candidate entity concepts. Disambiguation is then performed by combining Wikipedia page information and link information with the context in which the named entity appears. In an experiment on a small test set, the method achieves an accuracy of 87.5%, which indicates that the approach is feasible and effective.
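The abstract does not specify the statistical model, so the following is only a minimal sketch of the general idea, under stated assumptions: each candidate sense from a disambiguation page is scored by mixing a text feature (cosine similarity between the mention context and the candidate page text) with a link feature (how many of the candidate page's link titles occur in the context). The function names, the `text_weight` parameter, the linear combination, and the toy data are all hypothetical and are not taken from the paper; input text is assumed to be already word-segmented.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def link_overlap(context_tokens, candidate_links):
    """Fraction of a candidate page's link titles that occur in the context."""
    if not candidate_links:
        return 0.0
    context = set(context_tokens)
    hits = sum(1 for link in candidate_links if link in context)
    return hits / len(candidate_links)


def disambiguate(context_tokens, candidates, text_weight=0.7):
    """Return the title of the highest-scoring candidate sense.

    `candidates` maps a sense title to {"tokens": [...], "links": [...]}.
    `text_weight` is a hypothetical mixing parameter, not a value from the paper.
    """
    context_bow = Counter(context_tokens)

    def score(title):
        page = candidates[title]
        text_sim = cosine(context_bow, Counter(page["tokens"]))
        link_sim = link_overlap(context_tokens, page["links"])
        return text_weight * text_sim + (1 - text_weight) * link_sim

    return max(candidates, key=score)


# Toy data (invented for illustration): the ambiguous mention is "苹果".
context = ["苹果", "发布", "新", "手机", "乔布斯"]
candidates = {
    "苹果公司": {
        "tokens": ["公司", "手机", "电脑", "乔布斯", "发布"],
        "links": ["乔布斯", "手机"],
    },
    "苹果 (水果)": {
        "tokens": ["水果", "果树", "种植", "营养"],
        "links": ["水果", "果树"],
    },
}

print(disambiguate(context, candidates))  # prints 苹果公司
```

In this toy case the company sense wins because it shares three context words and both of its link titles with the mention context, while the fruit sense shares none. A real system would need a word segmenter and features extracted from actual Wikipedia pages.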