基于最大熵的括号转录语法模型具有翻译能力强、模型训练简单的优点,成为近些年统计机器翻译研究的热点.然而,该模型存在短语调序实例样本分布不平衡的缺点.针对该问题,该文提出了一种引入集成学习的短语调序模型训练方法.在大规模数据集上的实验结果表明,我们的方法能有效改善调序模型的训练效果,显著提高翻译系统性能.
The Maximum Entropy Based BTG model becomes a hot topic in statistical machine translation in recent years due to its strong translation and easy to-train abilities. However, the distribution of reordering examples in this model is imbalanced. To solve this problem, we introduce an ensemble learning method for training phrase reor- dering model. Experimental results show that,the reordering model can reach a better training effect via our method and the performance of the translation system is improved significantly in a large-scale dataset.