A major obstacle to automatic summarization is the lack of effective automatic methods for evaluating summary content. This paper proposes a model for evaluating the content coherence of summaries, based on a Basic Element (BE) relation grid. The model treats each BE as a content unit and the "relation" component of the BE as the grammatical role of that unit, and it expresses the coherence of a summary through the transition probabilities of BE relations in the BE relation grid. On the DUC2005 dataset, the Pearson correlation coefficient between the model's scores and human judgments is 0.408, about 66% higher than the result obtained with the entity grid model proposed by Lapata in 2005.
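The two quantities named in the abstract, transition probabilities over a grid and the Pearson correlation with human scores, can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a grid whose rows are sentences and whose columns are content units, with each cell holding the unit's relation label in that sentence or a marker for absence. The labels `subj`, `obj`, and `mod`, the `ABSENT` marker, and the function names are hypothetical, and the sketch stops at the transition distribution without claiming how the paper turns it into a coherence score.

```python
from collections import Counter
from math import sqrt

ABSENT = "-"  # hypothetical marker: the unit does not occur in this sentence


def transition_probabilities(grid):
    """Estimate length-2 transition probabilities from a role grid.

    grid: list of rows, one per sentence; row[j] is the role label of
    content unit j in that sentence, or ABSENT.
    Returns a dict mapping (role_in_sentence_i, role_in_sentence_i+1)
    to its relative frequency over all columns.
    """
    counts = Counter()
    for column in zip(*grid):  # walk each content unit through the text
        for current, following in zip(column, column[1:]):
            counts[(current, following)] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in counts.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


# Toy grid: 3 sentences (rows) x 2 content units (columns), made-up labels.
toy_grid = [
    ["subj", ABSENT],
    ["obj", "mod"],
    [ABSENT, "mod"],
]
toy_probs = transition_probabilities(toy_grid)
```

In an evaluation of the kind the abstract describes, `pearson` would be applied to the model's per-summary scores and the corresponding human scores. The toy grid above only shows the shape of the transition distribution.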