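Acquiring English-Chinese Argument Correspondences for Verb Subcategorization
  • Journal: Journal of Chinese Information Processing (中文信息学报), 2010, 24(2): 91-95 (EI indexed)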
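  • Classification: TP391 [Automation and Computer Technology: Computer Application Technology; Automation and Computer Technology: Computer Science and Technology]
  • Author affiliations: [1] Ministry of Education-Microsoft Key Laboratory of Language and Speech, Harbin Institute of Technology, Harbin, Heilongjiang 150001; [2] School of Computer Science and Technology, Heilongjiang University, Harbin, Heilongjiang 150001
  • Funding: National Natural Science Foundation of China (60773069, 60973169)
  • Related project: Research on Automatic Acquisition of English-Chinese Verb Subcategorization Correspondences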
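Abstract (translated from Chinese):

Verb subcategorization is a finer division of verbs according to their syntactic behavior, and it is composed of a head verb and a series of arguments. Related research has produced good results for individual languages such as English and Chinese, but cross-lingual studies remain scarce. This paper proposes a method, based on an active learning strategy, for automatically acquiring argument correspondences between English and Chinese verb subcategorization. The method operates on a bilingual parallel corpus and requires almost no prior linguistic knowledge to acquire the English-Chinese argument correspondences. We then added these correspondences to a statistical machine translation (SMT) system. Experimental results show that the SMT system incorporating the English-Chinese verb subcategorization argument correspondences improves markedly in performance, which demonstrates the validity of the automatically extracted correspondences and also offers a new research direction for SMT.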

English abstract:

Verb subcategorization (SCF) is a finer classification of verbs based on their syntactic behavior, and it is composed of a verb and several arguments. It has recently attracted substantial research for single languages, e.g. English and Chinese, whereas cross-lingual subcategorization demands more systematic effort. We present a novel method, based on active learning, for obtaining SCF argument correspondences between Chinese and English. The method can find new relations from bilingual parallel sentence pairs with almost no prior linguistic knowledge. We also integrated these relations into a statistical machine translation (SMT) system, and experimental results show that the performance of the SMT system combined with the bilingual argument relations improves significantly, which indicates the validity of the automatically obtained argument correspondences.
