Phrase-based statistical machine translation is one of the most widely used models in machine translation. However, because decoding relies on exact phrase matching, the model suffers from severe data sparseness, and a large number of phrases in the phrase table are never fully exploited. To address this, this paper proposes an interactive translation approach based on human-machine cooperation. For a phrase that cannot be found in the phrase table, the system first uses fuzzy matching to retrieve similar phrases from the table. An ensemble classifier then judges which of these similar phrases are likely to improve the translation quality of the sentence. Finally, through human-machine interaction, a phrase is selected that is both likely to improve translation quality and preserves the meaning of the original sentence. Experimental results on a spoken-language corpus show that the approach effectively improves the quality of the system's translations.
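The fuzzy-matching step can be illustrated with a minimal sketch. The abstract does not say which similarity measure is used, so the token-level edit distance, the `max_distance` threshold, the function names, and the toy phrase table below are all assumptions made for illustration, not the paper's actual method. The classifier and the interactive selection step are not sketched here.

```python
from typing import Dict, List, Tuple


def token_edit_distance(a: List[str], b: List[str]) -> int:
    """Levenshtein distance computed over tokens rather than characters."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # delete a token from a
                            curr[j - 1] + 1,      # insert a token from b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def fuzzy_candidates(unknown: str,
                     phrase_table: Dict[str, str],
                     max_distance: int = 1) -> List[Tuple[str, int]]:
    """Return source phrases within max_distance token edits of `unknown`.

    The threshold is a hypothetical parameter; results are sorted by
    distance, then alphabetically, so the ordering is deterministic.
    """
    unknown_tokens = unknown.split()
    scored = []
    for source in phrase_table:
        if source == unknown:
            continue  # an exact match is not an unknown phrase
        distance = token_edit_distance(unknown_tokens, source.split())
        if distance <= max_distance:
            scored.append((source, distance))
    return sorted(scored, key=lambda pair: (pair[1], pair[0]))


# Toy phrase table (hypothetical); target sides are placeholders only.
toy_phrase_table = {
    "a cup of tea": "<target 1>",
    "a glass of water": "<target 2>",
    "a cup of coffee": "<target 3>",
}

if __name__ == "__main__":
    for source, distance in fuzzy_candidates("a cup of milk", toy_phrase_table):
        print(source, distance)
```

In the pipeline the abstract describes, candidates such as these would then be filtered by the classifier and shown to a human, who chooses the one that keeps the meaning of the original sentence.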