This paper proposes a spectral clustering method for Chinese coreference resolution. Features are first extracted from mention pairs, and a maximum entropy model estimates the probability that the two mentions in each pair are coreferent. These probabilities are then used as similarities to build the similarity matrix for spectral clustering, and the resulting clusters are taken as entities, which completes the resolution. Because the method partitions mentions into entities from a global perspective, it effectively improves precision. Diagnostic experiments on the ACE 2007 standard dataset show that the method improves ACE Value by 2.5% and Unweighted Precision by 5.4% over the baseline.
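To make the clustering step concrete, the following is a minimal sketch, not the authors' implementation. It assumes the pairwise coreference probabilities are already available (the values in `prob` are hypothetical stand-ins for maximum entropy output) and performs only a single two-way normalized spectral cut, whereas the method described above produces an arbitrary number of entities. The function name `spectral_bipartition` and its parameters are illustrative assumptions, and every mention is assumed to have a positive total similarity.

```python
import math


def spectral_bipartition(sim, iterations=200):
    """Split items into two clusters with one normalized spectral cut.

    sim is a symmetric matrix (list of lists) of pairwise similarities,
    standing in here for pairwise coreference probabilities.
    """
    n = len(sim)
    degree = [sum(row) for row in sim]
    root = [math.sqrt(d) for d in degree]
    # Normalized affinity M = D^{-1/2} W D^{-1/2}. Its top eigenvector is
    # proportional to sqrt(degree), so it is deflated to reach the second.
    m = [[sim[i][j] / (root[i] * root[j]) for j in range(n)] for i in range(n)]
    norm = math.sqrt(sum(degree))
    top = [r / norm for r in root]

    v = [float(i + 1) for i in range(n)]  # arbitrary deterministic start
    for _ in range(iterations):
        # Multiply by (M + I); the shift keeps all eigenvalues non-negative.
        w = [sum(m[i][j] * v[j] for j in range(n)) + v[i] for i in range(n)]
        # Remove the component along the trivial top eigenvector.
        proj = sum(a * b for a, b in zip(w, top))
        w = [a - proj * b for a, b in zip(w, top)]
        length = math.sqrt(sum(a * a for a in w))
        v = [a / length for a in w]

    # Rescale by D^{-1/2} and split the mentions on the sign of each entry.
    left = [i for i in range(n) if v[i] / root[i] >= 0]
    right = [i for i in range(n) if v[i] / root[i] < 0]
    return sorted([left, right])


# Hypothetical probabilities for five mentions: 0-2 likely corefer, as do 3-4.
prob = [
    [0.0, 0.9, 0.9, 0.1, 0.1],
    [0.9, 0.0, 0.9, 0.1, 0.1],
    [0.9, 0.9, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.0, 0.9],
    [0.1, 0.1, 0.1, 0.9, 0.0],
]
entities = spectral_bipartition(prob)
print(entities)  # [[0, 1, 2], [3, 4]]
```

The cut uses the whole similarity matrix at once, which is the sense in which the partition is global rather than a sequence of independent pairwise decisions. In practice the eigenvector would more likely come from a linear algebra library than from the hand-written power iteration used here to keep the sketch dependency-free.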