Because words vary widely in both form and meaning, computing the semantic similarity of words is an active research topic in natural language processing, intelligent retrieval, document clustering, and related fields. Based on the characteristics of how words are expressed, this paper builds on two existing approaches to computing word semantic similarity, one based on a semantic dictionary and the other on a large-scale corpus, and proposes an improved method that combines the subjective and the objective. Methodologically, the algorithm integrates subjective empiricism with objective rationalism, so that the computed word semantic similarity is more accurate. Experimental results show that the proposed method is effective and significantly improves the accuracy of word semantic similarity computation.