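Cross-document named entity coreference means that identical names appearing in multiple texts refer to the same object. Coreference resolution is the process of determining whether identical names refer to the same object. Cross-document coreference resolution plays an important role in applications such as multi-document summarization and information fusion. Focusing on personal names, the most typical kind of named entity in Chinese, this paper studies two important issues in applying hierarchical clustering to cross-document coreference resolution: feature selection and the stopping condition for clustering.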
Cross-document named entity coreference resolution is the process of determining whether an identical name occurring in different texts refers to the same object. With the increasing need for multi-document applications, such as multi-document summarization and information fusion, cross-document named entity coreference resolution has drawn much attention. This paper focuses on cross-document personal name coreference resolution and implements an agglomerative clustering approach for it, discussing in detail two issues: feature selection and the stopping measures used to end clustering and thereby estimate the number of entities.
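
The general idea of agglomerative clustering with a stopping condition can be illustrated with a minimal sketch. It is not the paper's implementation: the bag-of-words context features, cosine similarity, average linkage, the fixed similarity threshold, and the names `cosine` and `cluster_mentions` are all assumptions chosen for illustration. Each input item stands for the context of one document that mentions the same personal name, and the number of clusters left when merging stops serves as the estimated number of distinct entities.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster_mentions(contexts, threshold=0.3):
    """Average-link agglomerative clustering with a similarity-threshold stop.

    contexts: one list of context tokens per document mentioning the name.
    threshold: hypothetical stopping value, not taken from the paper.
    Returns a list of clusters, each a sorted list of document indices.
    """
    vectors = [Counter(tokens) for tokens in contexts]
    clusters = [[i] for i in range(len(vectors))]  # start with one cluster per document

    def avg_link(c1, c2):
        total = sum(cosine(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the most similar pair of clusters.
        best, pair = -1.0, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                s = avg_link(clusters[x], clusters[y])
                if s > best:
                    best, pair = s, (x, y)
        # Stopping condition (assumed): no remaining pair is similar enough to merge.
        if best < threshold:
            break
        x, y = pair
        clusters[x] = sorted(clusters[x] + clusters[y])
        del clusters[y]
    return clusters


# Four toy documents mentioning the same name in two different contexts.
docs = [
    ["professor", "physics", "university"],
    ["physics", "professor", "lecture"],
    ["striker", "football", "goal"],
    ["football", "goal", "club"],
]
result = cluster_mentions(docs)
```

In this toy input, documents 0 and 1 share two of three context words, as do documents 2 and 3, while the two groups share none, so merging stops with two clusters. Which features to use and when to stop merging are exactly the two questions the abstract names; the choices made here are placeholders.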