Translation equivalent pairs are widely used in bilingual lexicography, machine translation, and cross-lingual information retrieval. In this paper, translation equivalent pairs are extracted from the translation corresponding trees of bilingual sentence pairs. The automatically extracted pairs are scored with three features: translation literality, phrase alignment probability, and the length difference between the target-language phrase and the source-language phrase. An evaluation method based on a multiple linear regression model is proposed and combined with an N-Best strategy to filter candidate translation equivalent pairs. Experimental results show that, in open tests, the evaluation and filtering method based on multiple linear regression outperforms the other methods compared.
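The scoring and filtering idea can be illustrated with a minimal sketch. It assumes each candidate pair carries the three features named above and that a small set of hand-labeled quality scores is available for fitting; the function names, feature values, labels, and phrase pairs below are all hypothetical and are not taken from the paper. The regression is fitted by ordinary least squares, which is one common way to estimate a multiple linear regression model and may differ from the authors' procedure.

```python
from collections import defaultdict


def fit_linear_regression(X, y):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y.

    A constant 1.0 is prepended to every row, so w[0] is the intercept.
    """
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


def score(weights, features):
    """Predicted quality of one candidate pair."""
    return weights[0] + sum(w * f for w, f in zip(weights[1:], features))


def n_best_filter(candidates, weights, n):
    """Keep the n highest-scoring target phrases for each source phrase."""
    by_source = defaultdict(list)
    for src, tgt, feats in candidates:
        by_source[src].append((score(weights, feats), tgt))
    return {
        src: [tgt for _, tgt in
              sorted(items, key=lambda item: item[0], reverse=True)[:n]]
        for src, items in by_source.items()
    }


# Made-up training data: (literality, alignment probability, length difference)
# with made-up quality labels, for illustration only.
train_X = [
    (1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0),
    (0.0, 0.0, 1.0),
    (1.0, 1.0, 0.0),
    (0.5, 0.5, 2.0),
]
train_y = [0.7, 0.5, 0.0, 1.0, 0.2]

# Made-up candidates: (source phrase, target phrase, features).
candidates = [
    ("jiqi fanyi", "machine translation", (0.9, 0.8, 0.0)),
    ("jiqi fanyi", "machine", (0.5, 0.4, 1.0)),
    ("jiqi fanyi", "translation system", (0.6, 0.5, 0.0)),
    ("cidian", "dictionary", (1.0, 0.9, 0.0)),
    ("cidian", "the", (0.1, 0.2, 0.0)),
]

weights = fit_linear_regression(train_X, train_y)
filtered = n_best_filter(candidates, weights, n=2)
```

In this sketch N-Best keeps a fixed number of targets per source phrase, so a weak candidate can survive when few alternatives exist; whether the paper also applies a score threshold is not stated in the abstract.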