Existing sentence similarity methods consider only a sentence's dependency relations, or the part of speech, word order, and word meaning of its constituent words; they do not take the semantics of the sentence as a whole into account. This paper proposes a sentence similarity method based on semantic extension to address this omission. A search engine is used to expand each short sentence into a long text, and a topic model then extracts features from that text, so that computing sentence similarity reduces to measuring the difference between the semantics of the two sentences. Experimental results show that the method reaches an accuracy of 87% and that its results agree with common-sense judgments.
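The pipeline described above (expand, extract topic features, compare) can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and is not taken from the paper: the search engine is replaced by a hard-coded `SNIPPETS` table, the topic model by two hand-built toy topics whose mixture is estimated by a simple EM fold-in, and the "difference" between two sentences by the Jensen-Shannon divergence of their topic mixtures. The abstract does not say which topic model or difference measure the authors use.

```python
import math
from collections import Counter

# Hypothetical stand-in for search-engine results (sentence -> snippets).
# A real system would query a search engine; none is called here.
SNIPPETS = {
    "apple releases new phone": ["apple phone launch screen battery",
                                 "new phone chip camera"],
    "new smartphone from apple": ["smartphone apple camera battery launch",
                                  "phone screen chip"],
    "how to bake apple pie": ["pie recipe oven sugar flour",
                              "bake apple sugar butter oven"],
}

# Toy vocabulary and topics, assumed to come from an already trained model.
TECH = ["phone", "smartphone", "screen", "battery", "chip", "camera",
        "launch", "releases"]
FOOD = ["pie", "recipe", "oven", "sugar", "flour", "butter", "bake"]
SHARED = ["apple", "new"]
VOCAB = TECH + FOOD + SHARED


def build_topic(own):
    """Return p(word | topic), with most mass on the topic's own words."""
    weights = {w: 10.0 if w in own else 5.0 if w in SHARED else 0.1
               for w in VOCAB}
    total = sum(weights.values())
    return {w: v / total for w, v in weights.items()}


TOPICS = [build_topic(TECH), build_topic(FOOD)]


def expand(sentence):
    """Semantic extension: append the stubbed snippets to the sentence."""
    text = " ".join([sentence] + SNIPPETS.get(sentence, []))
    return text.lower().split()


def infer_topics(tokens, iters=50):
    """Estimate a text's topic mixture by EM with the topics held fixed."""
    counts = Counter(t for t in tokens if t in TOPICS[0])
    k = len(TOPICS)
    theta = [1.0 / k] * k
    for _ in range(iters):
        new = [0.0] * k
        for word, n in counts.items():
            denom = sum(theta[j] * TOPICS[j][word] for j in range(k))
            for j in range(k):
                new[j] += n * theta[j] * TOPICS[j][word] / denom
        total = sum(new)
        if total == 0.0:  # no in-vocabulary words: keep the uniform mixture
            break
        theta = [v / total for v in new]
    return theta


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; 0 for identical distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x):
        return sum(a * math.log2(a / b) for a, b in zip(x, m) if a > 0)

    return (kl(p) + kl(q)) / 2


def sentence_similarity(s1, s2):
    """Similarity in [0, 1]: one minus the divergence of the topic mixtures."""
    p = infer_topics(expand(s1))
    q = infer_topics(expand(s2))
    return max(0.0, min(1.0, 1.0 - js_divergence(p, q)))


if __name__ == "__main__":
    a, b, c = list(SNIPPETS)
    print("phone vs. smartphone:", sentence_similarity(a, b))
    print("phone vs. pie:       ", sentence_similarity(a, c))
```

In this toy setting the two phone sentences share only a few surface words yet land on nearly the same topic mixture after expansion, so they score higher than the phone and pie pair, even though all three sentences contain "apple". With a real search engine and a trained topic model the scores would be less extreme than with these hand-built topics.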