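Bilingual sentence alignment plays a very important role in bilingual corpus processing and is the first step in building a bilingual dictionary. This paper models the bilingual abstracts of biomedical literature with a maximum weight matching model on a weighted bipartite graph. Without a bilingual dictionary, the method combines length-based sentence alignment with sentence position information, makes full use of the anchor information in the corpus of bilingual medical abstracts, and classifies the paragraphs and sentences of the abstracts to compute their similarity by category. Sentence alignment of bilingual biomedical abstracts is thereby achieved, with good experimental results.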
Sentence alignment is an essential step in bilingual corpus processing, and aligning the sentences of bilingual biomedical abstracts is the first step in constructing a biomedical bilingual lexicon. This paper describes a sentence alignment method that uses maximum weight matching on a weighted bipartite graph. After combining sentence length and sentence position information, the method employs anchor information to calculate paragraph similarity and sentence similarity in bilingual biomedical abstracts. Good experimental results support the effectiveness of the method.
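As an illustration of the general idea only, the following minimal Python sketch scores every source/target sentence pair by length and relative position and then picks the one-to-one matching with the largest total weight. The scoring functions, the `ratio` parameter, and all function names are assumptions made for this sketch and are not taken from the paper. The anchor information is not modeled, and an exhaustive search stands in for a proper maximum weight matching algorithm such as Kuhn-Munkres, so it is only practical for a handful of sentences.

```python
from itertools import permutations


def length_similarity(src_len, tgt_len, ratio=1.0):
    """Score in (0, 1]; 1 means the lengths agree after scaling by `ratio`.

    `ratio` is a hypothetical expected target/source length ratio.
    Both lengths are assumed to be positive.
    """
    scaled = src_len * ratio
    return min(scaled, tgt_len) / max(scaled, tgt_len)


def position_similarity(i, n, j, m):
    """Score in [0, 1]; higher when sentence i of n and sentence j of m
    sit at similar relative positions in their abstracts."""
    return 1.0 - abs((i + 0.5) / n - (j + 0.5) / m)


def build_weights(src_lens, tgt_lens, ratio=1.0):
    """Edge weights of the bipartite graph: length score times position score."""
    n, m = len(src_lens), len(tgt_lens)
    return [
        [length_similarity(s, t, ratio) * position_similarity(i, n, j, m)
         for j, t in enumerate(tgt_lens)]
        for i, s in enumerate(src_lens)
    ]


def max_weight_matching(weights):
    """Exhaustive maximum weight one-to-one matching.

    Returns a list of (source_index, target_index) pairs that covers
    every sentence on the smaller side of the graph.
    """
    n, m = len(weights), len(weights[0])
    if n > m:
        # Match from the smaller side, then swap the indices back.
        transposed = [list(col) for col in zip(*weights)]
        return sorted((i, j) for j, i in max_weight_matching(transposed))
    best_score, best = float("-inf"), []
    for cols in permutations(range(m), n):
        score = sum(weights[i][j] for i, j in enumerate(cols))
        if score > best_score:
            best_score, best = score, list(enumerate(cols))
    return best


# Sentence lengths in characters (made-up numbers for illustration).
src_lens = [20, 45, 30]
tgt_lens = [60, 135, 90]
weights = build_weights(src_lens, tgt_lens, ratio=3.0)
print(max_weight_matching(weights))  # [(0, 0), (1, 1), (2, 2)]
```

Multiplying the two scores means a pair needs both a plausible length ratio and a plausible position to receive a high weight; a weighted sum would be an equally reasonable choice for a sketch like this.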