Statistical machine translation generalizes poorly to time expressions, numerals, and quantifiers. To improve how a Chinese-Uyghur machine translation system translates time, numeral, and quantifier phrases, this paper mines a bilingual corpus to extract Chinese time, numeral, and quantifier expressions together with their translation patterns. On this basis it implements a template-based method for translating times, numerals, and unambiguous quantifiers, and a context-based method for translating ambiguous quantifiers. The translation F-measures for times, numerals, unambiguous quantifiers, and ambiguous quantifiers reach 93.23%, 90.15%, 96.55%, and 87.58%, respectively. Experiments show that the method is simple and efficient.
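The general idea of template-based translation for time expressions can be illustrated with a minimal sketch. It is not the paper's implementation: the source-side pattern `DATE_TEMPLATE`, the target-side `TARGET_TEMPLATE`, and the function `translate_dates` are hypothetical names introduced here, and the target template uses placeholder slot markers rather than real Uyghur output. A real system would use patterns mined from the bilingual corpus.

```python
import re

# Hypothetical source-side template: a Chinese date such as "2013年5月3日".
DATE_TEMPLATE = re.compile(
    r"(?P<year>\d{4})年(?P<month>\d{1,2})月(?P<day>\d{1,2})日"
)

# Hypothetical target-side template. The slot markers are placeholders,
# not real Uyghur; mined translation patterns would go here instead.
TARGET_TEMPLATE = "<DATE y={year} m={month} d={day}>"


def translate_dates(sentence: str) -> str:
    """Replace every matched date with the filled target-side template."""

    def fill(match: re.Match) -> str:
        # Copy the captured slots (year, month, day) into the target template.
        return TARGET_TEMPLATE.format(**match.groupdict())

    return DATE_TEMPLATE.sub(fill, sentence)


if __name__ == "__main__":
    print(translate_dates("会议于2013年5月3日举行"))
```

Text outside the matched span is left untouched, so in a pipeline of this kind the remainder of the sentence would still be handled by the statistical translation system.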