短语等价对在词典编纂、机器翻译和跨语言信息检索中有着广泛的应用.提出了一种新的短语对齐方法,使用可信度较高的词典对齐结果来抽取源语言短语的译文中心语块,依据译文扩展可信度来确定源语言短语的译文统计边界.从译文中心语块出发,结合译文统计边界生成源语言短语的所有候选译文.对候选译文进行评价,从中选出最可靠的译文.同时利用贪心算法消除源语言短语译文边界之间的交叉冲突.实验结果表明,所提出的方法在开放测试中其正确率达到了82.76%,性能好于其他方法.
Phrase equivalence pairs are widely used in bilingual lexicography, machine translation, and cross-language information retrieval. This paper proposes a new phrase alignment method. The translation head-phrase of a source language phrase is extracted from highly reliable dictionary-based word alignment results, and its statistical translation boundary is determined from the translation extending reliability. Starting from the translation head-phrase and constrained by the statistical translation boundary, all candidate translations of the source language phrase are generated. A linear combination model evaluates the candidates, and the most reliable one is selected. In addition, a greedy algorithm eliminates crossing conflicts between the translation boundaries of source language phrases. Experimental results show that the proposed method achieves a precision of 82.76% in the open test, outperforming other approaches.
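The pipeline summarized above (candidate generation, linear scoring, greedy conflict resolution) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify the features, the weights, or the exact definition of a crossing conflict, so several details are assumptions made only for illustration: spans are inclusive target-token index pairs, a candidate is any span that contains the head-phrase and stays inside the statistical boundary, a conflict is a partial overlap between two spans, and the names `Candidate`, `generate_candidates`, `linear_score`, `crosses`, and `greedy_select` are hypothetical.

```python
from typing import Dict, List, NamedTuple, Sequence, Tuple


class Candidate(NamedTuple):
    phrase_id: int  # index of the source language phrase (hypothetical id)
    start: int      # first target token index, inclusive
    end: int        # last target token index, inclusive
    score: float    # score assigned by the evaluation model


def generate_candidates(head: Tuple[int, int],
                        boundary: Tuple[int, int]) -> List[Tuple[int, int]]:
    """Enumerate every span that contains the head span and lies within
    the boundary span (assumed reading of 'combining' the two)."""
    h_start, h_end = head
    b_start, b_end = boundary
    return [(s, e)
            for s in range(b_start, h_start + 1)
            for e in range(h_end, b_end + 1)]


def linear_score(features: Sequence[float], weights: Sequence[float]) -> float:
    """Linear combination of feature values; the features and weights
    themselves are placeholders, not those of the paper."""
    return sum(w * f for w, f in zip(weights, features))


def crosses(a: Candidate, b: Candidate) -> bool:
    """Assumed conflict definition: the spans overlap partially,
    i.e. they intersect but neither contains the other."""
    return (a.start < b.start <= a.end < b.end
            or b.start < a.start <= b.end < a.end)


def greedy_select(candidates: List[Candidate]) -> Dict[int, Candidate]:
    """Visit candidates from highest to lowest score and keep the first
    one per source phrase that crosses no span accepted so far."""
    chosen: Dict[int, Candidate] = {}
    for cand in sorted(candidates, key=lambda c: c.score, reverse=True):
        if cand.phrase_id in chosen:
            continue  # this phrase already has a translation
        if any(crosses(cand, other) for other in chosen.values()):
            continue  # would cross an accepted boundary
        chosen[cand.phrase_id] = cand
    return chosen


if __name__ == "__main__":
    # Head-phrase covers tokens 2..3, statistical boundary covers 1..4.
    print(generate_candidates((2, 3), (1, 4)))
    pool = [
        Candidate(0, 0, 2, 0.9), Candidate(0, 0, 1, 0.5),
        Candidate(1, 2, 4, 0.8), Candidate(1, 3, 4, 0.6),
    ]
    for pid, cand in sorted(greedy_select(pool).items()):
        print(pid, (cand.start, cand.end), cand.score)
```

In this toy input, the best candidate for phrase 1 (tokens 2..4) partially overlaps the span already accepted for phrase 0 (tokens 0..2), so the greedy pass falls back to the lower-scoring span 3..4. A greedy pass of this kind is simple and fast but does not guarantee a globally optimal set of boundaries.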