Research on Methods for Extracting Translation Equivalents from Chinese-English Comparable Corpora

ISSN: 1002-8331
Journal: Computer Engineering and Applications (《计算机工程与应用》)
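Classification: TP391 [Automation and Computer Technology: Computer Application Technology; Automation and Computer Technology: Computer Science and Technology]
Author affiliation: [1] China Center for Information Industry Development, Beijing 100044
Funding: National Natural Science Foundation of China (Grant No. 60572132); 2005 NSFC external grant (No. 60520130297).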

Keywords: comparable corpus, extraction of translation equivalents, context vector, computation of vector similarity

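Abstract (translated from the Chinese):

This paper reviews the classification of corpora and the history of research on extracting translation equivalents from comparable corpora. Such extraction rests on one basic assumption: when a word in one language is mapped to another language, its co-occurrence and collocation relations with the surrounding words are preserved. On that basis, three methods are used to raise the accuracy of extraction: computing equivalent pairs in both directions and taking the intersection, computing the word weighting factor TF(iw) * IDF(i), and using the part-of-speech information of context words. The steps of the extraction experiment are described and the results are briefly analyzed. The results show that these methods effectively improve the accuracy of the computed translation equivalents. Finally, problems that need further research are raised.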

English abstract:

This paper reviews the classification of corpora and the history of research on extracting translation equivalents from comparable corpora. Extraction from comparable corpora rests on a basic hypothesis, namely that the context distributions of words that are translations of each other are correlated. Building on this hypothesis, the paper adopts three methods to improve the accuracy of the candidate translation equivalents: extracting equivalents in both directions and taking the intersection, calculating the word weight factor TF(iw) * IDF(i), and using the POS information of the context words. The paper describes the steps of the extraction experiment and analyzes its results. The results show that these methods can improve the accuracy of the candidate translation equivalents extracted from comparable corpora. Finally, the paper raises issues that require further research.
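The abstract names the ingredients of the method but gives no formulas, so the following is only a minimal Python sketch of the general technique, not the authors' implementation. It builds context vectors from co-occurrence counts, reweights them with one common TF * IDF variant, projects them through a seed dictionary, compares them by cosine similarity, and keeps only the pairs found in both directions. The function names, the window size, the IDF formula, the seed dictionary, and the toy sentences are all assumptions made for illustration. The POS filtering mentioned in the abstract is not included.

```python
import math
from collections import Counter, defaultdict


def context_vectors(sentences, window=2):
    """Map each word to counts of the words seen within `window` positions."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def tfidf(vectors):
    """Reweight counts by TF * IDF (an assumed variant, not the paper's formula)."""
    n = len(vectors)
    df = Counter(c for vec in vectors.values() for c in vec)
    return {
        w: {c: tf * math.log(1 + n / df[c]) for c, tf in vec.items()}
        for w, vec in vectors.items()
    }


def project(vec, seed):
    """Carry a context vector into the other language via a seed dictionary."""
    out = defaultdict(float)
    for c, weight in vec.items():
        if c in seed:
            out[seed[c]] += weight
    return out


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def best_matches(src, tgt, seed):
    """For each source word not in the seed dictionary, pick the closest target word."""
    known_targets = set(seed.values())
    result = {}
    for w, vec in src.items():
        if w in seed:
            continue
        projected = project(vec, seed)
        scored = [(cosine(projected, tvec), t)
                  for t, tvec in tgt.items() if t not in known_targets]
        score, t = max(scored, default=(0.0, None))
        if score > 0:
            result[w] = t
    return result


def bidirectional_pairs(src_sents, tgt_sents, seed):
    """Keep only the pairs that are each other's best match in both directions."""
    src = tfidf(context_vectors(src_sents))
    tgt = tfidf(context_vectors(tgt_sents))
    reverse_seed = {v: k for k, v in seed.items()}
    forward = best_matches(src, tgt, seed)
    backward = best_matches(tgt, src, reverse_seed)
    return {(s, t) for s, t in forward.items() if backward.get(t) == s}


# Hypothetical toy data: pre-tokenized, non-parallel sentences and a small seed dictionary.
zh_sents = [["银行", "提高", "利率"], ["银行", "降低", "利率"],
            ["球队", "赢得", "比赛"], ["球队", "输掉", "比赛"]]
en_sents = [["bank", "raises", "rate"], ["bank", "cuts", "rate"],
            ["team", "wins", "match"], ["team", "loses", "match"]]
seed = {"银行": "bank", "提高": "raises", "降低": "cuts",
        "球队": "team", "赢得": "wins", "输掉": "loses"}

pairs = bidirectional_pairs(zh_sents, en_sents, seed)
print(sorted(pairs))
```

Taking the intersection discards a candidate whenever the reverse direction prefers a different word. That trades recall for precision, which matches the abstract's stated goal of raising accuracy.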

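Projects associated with this journal paper

An Exploratory Study of Chinese-English Comparable Corpora and Automatic Extraction of Sports Terminology

Journal papers: 3

Research on Chinese-English Bidirectional Machine Translation Methods Based on Corpora and Hybrid Strategies

Journal papers: 1

Journal papers from the same project

Design of a Chinese-English Bidirectional Machine Translation System Based on Hybrid Strategies

Research on Methods for Coordinating Rules and Templates in Machine Translation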

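Journal information

Computer Engineering and Applications (《计算机工程与应用》)
Peking University Core Journal (2014 edition)

Supervising body: China Electronics Technology Group Corporation
Sponsor: North China Institute of Computing Technology
Editor-in-chief: Huai Jinpeng (怀进鹏)
Address: Sub-box 26, P.O. Box 619, 211 North Fourth Ring Middle Road, Haidian District, Beijing
Postal code: 100083
Email: ceaj@vip.163.com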

International standard serial number: ISSN 1002-8331
Domestic unified serial number: CN 11-2127/TP
Postal distribution code: 82-605

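Awards:
1. 2012: among the first to receive, from the Chinese Academic Literature Evaluation Center, the "...
2. 2001: named a "China Periodical Matrix Double-Benefit Journal" by the Press and Publication Administration
3. 2008: among the first selected by the Ministry of Science and Technology as a "China Excellent Science and Technology Journal...
4. 2003-2011: continuously received the Ministry of Industry and Information Technology's highest journal...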

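Indexed in domestic and international databases:
Abstracts Journal (Russia), Index Copernicus (Poland), Cambridge Scientific Abstracts (USA), Science Abstracts database (UK), Japan Science and Technology Agency database (Japan), China Science and Technology Core Journals (China), Peking University Core Journals (China; 2000, 2004, 2008, and 2014 editions)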

Times cited: 97887