WordNet is an English lexical semantic knowledge base that plays an important role in natural language processing. This paper presents WNCT, a method for automatically translating the synsets (lexical concepts) in WordNet into Chinese. First, WNCT uses electronic dictionaries and term translation tools to translate the English words in WordNet into Chinese at the granularity of individual senses. Second, WNCT treats the selection of the correct sense for each word in a given synset as a classification problem: a classification model is trained on 12 features derived from the uniqueness of a translation, the translation intersections within and between synsets, the structural rules of Chinese phrases, and PMI-based translation relevance. Experimental results show that WNCT achieves a coverage of 85.21% and an accuracy of 81.37% for the Chinese translation of the synsets in WordNet 3.0.
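To make the PMI-based translation relevance feature more concrete, the following is a minimal sketch, assuming that PMI denotes pointwise mutual information and that it is estimated from co-occurrence counts in a Chinese corpus. The abstract does not say how the feature is actually computed, so the function names (`pmi`, `translation_relevance`), the averaging step, and the toy counts are all illustrative assumptions, not the authors' implementation.

```python
import math

NEG_INF = float("-inf")


def pmi(count_xy, count_x, count_y, total):
    """Pointwise mutual information, log2(P(x, y) / (P(x) * P(y))), from raw counts.

    Returns -inf when the two items never co-occur (an assumed convention).
    """
    if count_xy == 0 or count_x == 0 or count_y == 0:
        return NEG_INF
    return math.log2((count_xy * total) / (count_x * count_y))


def translation_relevance(candidate, context, counts, pair_counts, total):
    """Hypothetical feature: mean finite PMI between one candidate translation
    and the candidate translations of other words in the same synset."""
    scores = []
    for other in context:
        key = frozenset((candidate, other))
        score = pmi(pair_counts.get(key, 0),
                    counts.get(candidate, 0),
                    counts.get(other, 0),
                    total)
        if score != NEG_INF:
            scores.append(score)
    return sum(scores) / len(scores) if scores else NEG_INF


# Toy counts (invented for illustration): "yinhang" (bank, financial sense)
# co-occurs with "jinrong" (finance), while "hean" (river bank) does not.
counts = {"yinhang": 20, "jinrong": 50, "hean": 10}
pair_counts = {frozenset(("yinhang", "jinrong")): 10}
total = 1000

financial = translation_relevance("yinhang", ["jinrong"], counts, pair_counts, total)
river = translation_relevance("hean", ["jinrong"], counts, pair_counts, total)
```

In this toy setting the financial sense receives a higher relevance score than the river sense, which is the kind of signal a classifier could combine with the other features when choosing a translation for a synset.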