In Web data integration, a large number of co-referent event mentions refer to the same real-world event, which causes data redundancy. To address this problem, this paper proposes an event mention unification approach based on Markov logic networks. From a set of co-referent event mentions, the approach derives a single, relatively accurate and detailed mention that serves as the unified mention of the real-world event and provides high-quality data for data integration. Each event mention is represented along eight dimensions. A Markov logic network is trained to infer the accurate and detailed content of each dimension from the co-referent set, and the inferred dimension contents are recombined into one event mention. Rules are formulated with a small number of first-order predicates from several perspectives, including dimension content, event mentions, and data sources, and inference over these rules resolves inconsistent, incomplete, and insufficiently detailed data. Experimental results show that the approach obtains relatively accurate and detailed unified event mentions.
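The per-dimension selection and recombination described above can be illustrated with a minimal sketch. It is not the paper's method and it does not perform Markov logic network inference: it replaces the trained network with an independent log-linear score per dimension, in the spirit of weighted formulas. The dimension names (`time`, `place`), the three rules (support, detail, source trust), the weights, and the reliability values are all hypothetical, since the abstract does not list the eight dimensions, the predicates, or any learned weights.

```python
from dataclasses import dataclass


@dataclass
class Mention:
    source: str  # identifier of the data source (hypothetical)
    dims: dict   # dimension name -> content, "" when the dimension is missing


# Hand-set, hypothetical weights. In an actual Markov logic network the
# weights of the first-order formulas would be learned from training data.
W_SUPPORT = 1.0  # per co-referent mention whose content is contained in the candidate
W_DETAIL = 0.5   # per whitespace-separated token in the candidate
W_SOURCE = 2.0   # times the reliability of the best source stating the candidate


def score(candidate, dim, mentions, reliability):
    """Sum the weights of the hypothetical rules that the candidate satisfies."""
    values = [(m.source, m.dims.get(dim, "").strip()) for m in mentions]
    # Consistency: mentions whose non-empty content is a substring of the candidate.
    support = sum(1 for _, v in values if v and v in candidate)
    # Detail: longer content is treated as more detailed (a crude proxy).
    detail = len(candidate.split())
    # Source: highest reliability among sources that state exactly this content.
    trust = max((reliability.get(s, 0.0) for s, v in values if v == candidate),
                default=0.0)
    return W_SUPPORT * support + W_DETAIL * detail + W_SOURCE * trust


def unify(mentions, reliability):
    """Choose the best-scoring content per dimension and recombine the choices."""
    dims = sorted({d for m in mentions for d in m.dims})
    unified = {}
    for dim in dims:
        candidates = {m.dims.get(dim, "").strip() for m in mentions} - {""}
        if candidates:  # a dimension that every mention leaves empty is omitted
            unified[dim] = max(
                candidates,
                key=lambda c: (score(c, dim, mentions, reliability), c),
            )
    return unified


# Three co-referent mentions with inconsistent, incomplete, and coarse content.
mentions = [
    Mention("s1", {"time": "2012-05-03", "place": "Beijing"}),
    Mention("s2", {"time": "2012", "place": "Haidian, Beijing"}),
    Mention("s3", {"time": "2012-05-03", "place": ""}),
]
reliability = {"s1": 0.9, "s2": 0.6, "s3": 0.5}

print(unify(mentions, reliability))
```

Scoring each dimension independently keeps the sketch short. A Markov logic network would instead define a joint distribution over all ground atoms, so that evidence about one dimension or one data source could influence the choice made for another.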