A Chinese-English dictionary bridges two different languages and serves as a tool for communication between China and the world. In today's rapidly developing information age, techniques for automatically constructing bilingual dictionaries play an important role in machine translation and cross-language retrieval. This paper gives a fairly comprehensive analysis and summary of the methods for automatic bilingual dictionary construction and their key technologies, and proposes a method that builds a bilingual dictionary automatically by extracting bilingual word pairs from a Chinese-English parallel corpus. After the Chinese and English text has been aligned at the sentence level, each side of the corpus is word-segmented and part-of-speech tagged. Chinese and English word units are then extracted and their association probabilities computed to achieve Chinese-English word alignment, from which the bilingual dictionary is finally generated. In dictionary construction experiments on real corpora the method achieves good results, and its word alignment outperforms the traditional IBM model approach.
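The word-alignment step described above can be illustrated with a minimal sketch. The abstract does not say how the association probability is defined, so the Dice coefficient over sentence-level co-occurrence counts is used here purely as an assumed stand-in. The function name `build_lexicon`, the `min_score` threshold, and the toy corpus are likewise hypothetical, and the input is assumed to be already sentence-aligned and segmented (part-of-speech filtering is omitted).

```python
from collections import Counter
from itertools import product


def build_lexicon(pairs, min_score=0.5):
    """Pick, for each Chinese word, its best-scoring English word.

    pairs: list of (zh_tokens, en_tokens) from sentence-aligned,
    pre-segmented text. The score is the Dice coefficient, an
    assumption of this sketch rather than the paper's own measure.
    """
    zh_count, en_count, joint = Counter(), Counter(), Counter()
    for zh_tokens, en_tokens in pairs:
        # Count each word at most once per sentence pair.
        zh_set, en_set = set(zh_tokens), set(en_tokens)
        zh_count.update(zh_set)
        en_count.update(en_set)
        joint.update(product(zh_set, en_set))

    lexicon = {}
    for (zh, en), both in joint.items():
        # Dice coefficient: 2 * co-occurrences / (count_zh + count_en).
        score = 2 * both / (zh_count[zh] + en_count[en])
        best_so_far = lexicon.get(zh, ("", 0.0))[1]
        if score >= min_score and score > best_so_far:
            lexicon[zh] = (en, score)
    return lexicon


# Toy corpus (hypothetical data, already segmented and lowercased).
pairs = [
    (["我", "喜欢", "猫"], ["i", "like", "cats"]),
    (["他们", "喜欢", "狗"], ["they", "like", "dogs"]),
    (["我", "养", "狗"], ["i", "keep", "dogs"]),
    (["他们", "养", "猫"], ["they", "keep", "cats"]),
]
lexicon = build_lexicon(pairs)
for zh, (en, score) in lexicon.items():
    print(zh, en, round(score, 2))
```

In this toy corpus every correct pair co-occurs in both sentences where either word appears, so it scores 1.0, while every incorrect pair scores at most 0.5. On real data a simple co-occurrence measure like this is much noisier, which is where part-of-speech constraints and a probabilistic alignment model would come in.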