This paper improves textual entailment recognition by using latent semantic features of short texts. The method builds a latent variable model over sentences and derives sentence-to-sentence similarity features from it. These latent semantic features are combined with string features (word overlap, N-gram overlap, cosine similarity, etc.) and syntactic features (unlabeled and labeled subtree overlap), and the combined feature set is classified with an SVM. We evaluate the classifier on the RTE-8 task, and the results show that latent semantic features of short texts improve textual entailment recognition.
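The string features named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The whitespace tokenizer, the choice of bigrams for the N-gram feature, the normalization of overlap by the hypothesis, and all function names (`tokenize`, `ngrams`, `overlap`, `cosine_similarity`, `string_features`) are assumptions made for illustration. The latent variable model and the subtree overlap features are not shown.

```python
import math
from collections import Counter


def tokenize(sentence):
    # Assumption: lowercase whitespace tokenization; the abstract does not
    # say how sentences are tokenized.
    return sentence.lower().split()


def ngrams(tokens, n):
    # All contiguous n-token windows, as tuples.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def overlap(hyp_items, text_items):
    # Fraction of distinct hypothesis items that also occur in the text.
    # Assumption: overlap is normalized by the hypothesis side.
    hyp_set = set(hyp_items)
    if not hyp_set:
        return 0.0
    return len(hyp_set & set(text_items)) / len(hyp_set)


def cosine_similarity(tokens_a, tokens_b):
    # Cosine of the angle between the two bag-of-words count vectors.
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def string_features(text, hypothesis):
    # Feature vector: [word overlap, bigram overlap, cosine similarity].
    t, h = tokenize(text), tokenize(hypothesis)
    return [
        overlap(h, t),
        overlap(ngrams(h, 2), ngrams(t, 2)),
        cosine_similarity(t, h),
    ]
```

In a pipeline like the one described, a vector of this kind would be concatenated with the latent semantic similarity and syntactic features for each text-hypothesis pair before being passed to an SVM classifier.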