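Cross-document personal name disambiguation is the process of determining whether identical personal names appearing in different texts refer to the same real-world entity. It is the basis for accurately retrieving information about a person of interest, and it also plays an important role in applications such as multi-document summarization and information fusion. This paper applies social network analysis to the problem of ambiguous identical names across Chinese texts. The idea is to first cluster the names in the social network with spectral clustering and then, based on how different edge weights and different graph partition criteria affect disambiguation performance, to introduce a modularity threshold as the stopping condition for partitioning the social network. Tests on the CLP2010 Chinese personal name disambiguation data show that social network analysis is effective for personal name disambiguation.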
Cross-document personal name disambiguation is the process of determining whether an identical name occurring in different texts refers to the same person in the real world. With the increasing need for multi-document applications, for example multi-document summarization and information fusion, cross-document named entity disambiguation has drawn much attention. This paper employs a social network based algorithm for cross-document personal name disambiguation. The method uses a spectral clustering approach, compares the results of different graph partition criteria, and uses a modularity threshold as the stopping criterion for graph partitioning. The experimental datasets are built from the CLP 2010 Chinese personal name disambiguation task. The results show that this method is promising.
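The modularity-based stopping criterion can be illustrated with a minimal sketch. It computes Newman's modularity Q for a candidate partition of a weighted, undirected network and accepts the partition only if Q exceeds a threshold. The function names `modularity` and `accept_split`, the threshold value 0.3, and the exact way the threshold is applied are assumptions made for illustration; the abstract does not specify them, nor the edge weights used in the paper.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q of a partition of a weighted, undirected graph.

    edges: iterable of (u, v, weight), each edge listed once, no self-loops.
    community: dict mapping every node to a community label.
    """
    total = 0.0                     # m: total edge weight in the graph
    internal = defaultdict(float)   # edge weight inside each community
    degree = defaultdict(float)     # summed weighted degree of each community
    for u, v, w in edges:
        total += w
        degree[community[u]] += w
        degree[community[v]] += w
        if community[u] == community[v]:
            internal[community[u]] += w
    if total == 0:
        return 0.0
    # Q = sum over communities of (internal share - expected share)
    return sum(
        internal[c] / total - (degree[c] / (2 * total)) ** 2
        for c in degree
    )


def accept_split(edges, community, threshold=0.3):
    """Hypothetical stopping rule: keep a partition only if Q > threshold."""
    return modularity(edges, community) > threshold


if __name__ == "__main__":
    # Toy network: two triangles of co-occurring names joined by one edge.
    edges = [
        (0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
        (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
        (2, 3, 1.0),
    ]
    split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
    print(modularity(edges, split), accept_split(edges, split))
```

In this toy network the two-triangle partition has Q = 5/14, so it passes a threshold of 0.3, while leaving all nodes in one community gives Q = 0 and partitioning would stop.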