Textual entailment is a directional inference relation that is widely distributed in natural language texts. Research on textual entailment is a fundamental line of work in natural language processing: it supports other natural language processing tasks and has a wide range of applications. This paper first defines the scope of textual entailment research. As a binary relation, textual entailment gives rise to three basic research tasks: recognizing textual entailment, knowledge acquisition, and generating entailment pairs. Recognizing textual entailment has two core problems, semantic representation and reasoning mechanism; knowledge acquisition also has two core problems, knowledge representation and knowledge source. Research on generating entailment pairs has progressed comparatively slowly, and this paper analyzes in detail the internal and external causes. The paper surveys progress in recognizing textual entailment around semantic representation and reasoning mechanism, surveys progress in knowledge acquisition around knowledge representation and knowledge source, and points out the strengths and weaknesses of each class of methods. Progress in textual entailment research is inseparable from the related international evaluation exercises, so the paper also summarizes these evaluations and their datasets. The arrival of the big data era and the continuing development of deep learning theory provide rich sources of knowledge and powerful research tools for textual entailment research, and also bring many new research topics. Based on the current state of research, the paper outlines future research directions and discusses their feasibility from a theoretical perspective.