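This paper proposes a multi-model fusion framework for personal name translation based on weighted finite-state transducers (WFSTs). The framework is built around two character-based conversion models and two pronunciation-based conversion models, which are fused through weighted finite-state transducers to translate personal names. Compared with a single model, the advantage of the proposed method lies in maximizing the value of the data obtained from the various information sources. Experimental results show that the error rate of name translation with the multi-model fusion method is 7.14% lower than that of name translation with a single model.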
This paper proposes a novel framework for Chinese-English name back-transliteration that fuses multiple models using weighted finite-state transducers (WFSTs). Two grapheme-based models and two phoneme-based models form the kernel of the framework. Combining these models within the unified WFST framework yields a system for Chinese-English name back-transliteration. Compared with single-model systems, the advantage of this method lies in combining information from the different models and making maximal use of the available data. Our experiments show that the proposed framework reduces the error rate by 7.14% compared with a single-model system.
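To make the idea of fusing several models concrete, the following is a minimal sketch and not the system described here. It assumes each model returns candidate English names with a cost (a negative log-probability, lower is better) and that fusion is a weighted sum of those costs, which mirrors how path weights add under the tropical semiring commonly used with WFSTs. The model names, candidate names, costs, weights, and the `missing_cost` penalty are all hypothetical illustration values; a real system would compose transducers rather than merge candidate lists.

```python
# Hypothetical per-model candidate costs for one Chinese name.
# Costs are negative log-probabilities, so lower means more likely.
EXAMPLE_OUTPUTS = {
    "grapheme_a": {"Smith": 1.0, "Smyth": 0.8},
    "grapheme_b": {"Smith": 1.0, "Smyth": 1.5},
    "phoneme_a": {"Smith": 0.5, "Schmidt": 2.0},
    "phoneme_b": {"Smith": 1.0, "Smyth": 2.0, "Schmidt": 1.5},
}

# Hypothetical interpolation weights, one per model (equal here).
EXAMPLE_WEIGHTS = {name: 0.25 for name in EXAMPLE_OUTPUTS}


def fuse(model_outputs, weights, missing_cost=5.0):
    """Rank candidates by the weighted sum of their per-model costs.

    A candidate that a model did not propose is charged `missing_cost`
    for that model (an assumed penalty, not taken from the paper).
    Returns a list of (candidate, fused_cost) pairs, best first.
    """
    candidates = set()
    for scores in model_outputs.values():
        candidates.update(scores)

    fused = {}
    for cand in candidates:
        fused[cand] = sum(
            weights[name] * scores.get(cand, missing_cost)
            for name, scores in model_outputs.items()
        )
    # Sort by cost, then by name so that ties are broken deterministically.
    return sorted(fused.items(), key=lambda kv: (kv[1], kv[0]))


def best_single(model_outputs, name):
    """Return the lowest-cost candidate of one model on its own."""
    scores = model_outputs[name]
    return min(scores, key=lambda cand: (scores[cand], cand))


if __name__ == "__main__":
    for cand, cost in fuse(EXAMPLE_OUTPUTS, EXAMPLE_WEIGHTS):
        print(f"{cand}\t{cost:.3f}")
    print("grapheme_a alone picks:", best_single(EXAMPLE_OUTPUTS, "grapheme_a"))
```

In this toy data the first grapheme model on its own prefers a different spelling, while the fused ranking is decided by the agreement of the other three models. This illustrates, under the stated assumptions, how combining information from different models can correct an error made by a single one.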