With the growth of the Internet, research on the computational processing of short texts is flourishing, and semantic similarity occupies an important position in artificial intelligence, cognitive science, semantics, psychology, and biology. Building on existing traditional similarity algorithms, and with the aim of computing similarity faster and more accurately, this paper constructs a concept tree in order to confine short texts to a specific domain. Because concept trees and concept dictionaries express both the semantic relations between concepts and the hierarchical structure of concepts, they can substantially improve retrieval efficiency. Similarity computed on this basis also makes retrieval results more accurate, which in turn facilitates the study of similarity and uniqueness among short texts and improves the correctness of subsequent mining.