A Person-Name Translation System Based on Multi-Model Combination

ISSN: 1003-0077
Journal: Journal of Chinese Information Processing (《中文信息学报》)
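Classification: TP391 [Automation and Computer Technology: Computer Application Technology; Automation and Computer Technology: Computer Science and Technology]
Author affiliations: [1] Digital Content Technology Research Center, Institute of Automation, Chinese Academy of Sciences, Beijing 100190; [2] National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190
Funding: National High-Tech R&D Program of China (863 Program), grant 2006AA01Z194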

Keywords: computer application, Chinese information processing, multiple model combination, transliteration, named entity, weighted finite-state transducer (WFST)

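Chinese abstract (translated):

This paper proposes a multi-model combination framework for person-name translation based on weighted finite-state transducers (WFSTs). The framework is built around two character-based conversion models and two pronunciation-based conversion models, which are combined through WFSTs to translate person names. Compared with a single model, the advantage of the proposed method is that it maximizes the value of the data obtained from the various information sources. Experimental results show that the error rate of person-name translation with the multi-model combination method is 7.14% lower than that with a single model.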

English abstract:

This paper proposes a novel framework for Chinese-English name back-transliteration that combines multiple models using weighted finite-state transducers (WFSTs). Two grapheme-based models and two phoneme-based models form the kernel of the framework. Combining these models within the unified WFST framework yields a system for Chinese-English name back-transliteration. Compared with single-model systems, the advantage of this method lies in combining information from different models and making maximal use of the available data. Experiments show that the proposed framework reduces the error rate by 7.14% compared with a single-model system.
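
To make the idea of combining several models concrete, here is a minimal sketch of one common way to merge candidate scores: a weighted sum of negative log-probabilities, which is how path costs accumulate in a tropical-semiring WFST. This is an illustration only, not the paper's implementation. The model names, candidate names, probabilities, uniform weights, and the `floor` value for unseen candidates are all hypothetical assumptions.

```python
import math

def combine(model_outputs, weights, floor=1e-6):
    """Rank candidates by a weighted sum of negative log-probabilities.

    model_outputs: {model_name: {candidate: probability}}
    weights:       {model_name: weight}
    floor:         assumed probability for a candidate a model did not propose
    """
    # Pool every candidate proposed by any model.
    candidates = set()
    for scores in model_outputs.values():
        candidates.update(scores)

    combined = {}
    for cand in candidates:
        cost = 0.0
        for name, scores in model_outputs.items():
            p = scores.get(cand, floor)
            cost += weights[name] * -math.log(p)
        combined[cand] = cost

    # Lowest total cost first.
    return sorted(combined.items(), key=lambda kv: kv[1])

# Hypothetical outputs of two grapheme-based and two phoneme-based models
# for a single Chinese input name.
model_outputs = {
    "grapheme_a": {"Clinton": 0.6, "Klinton": 0.4},
    "grapheme_b": {"Clinton": 0.5, "Kelinton": 0.5},
    "phoneme_a": {"Klinton": 0.55, "Clinton": 0.45},
    "phoneme_b": {"Clinton": 0.7, "Clindon": 0.3},
}
weights = {name: 0.25 for name in model_outputs}  # uniform weights, an assumption

ranked = combine(model_outputs, weights)
best = ranked[0][0]
print(best)
```

In this toy setup a candidate that every model supports beats one that only some models propose, because each missing model contributes a large floor cost. In practice the weights would presumably be tuned on held-out data.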
