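Discourse annotation is an important task in natural language processing. Many other tasks, such as automatic summarization and question answering, can use discourse annotation to gain an understanding of a text's content and semantics and thereby achieve better results. At the same time, theories of discourse understanding, such as Rhetorical Structure Theory (RST) and Centering Theory (CT), are not closely tied to practical problems and are difficult to apply. In this paper, drawing on existing linguistic theories and several discourse-annotated corpora (such as RSTDT and PDTB) and taking into account the characteristics of natural language processing tasks, we propose an annotation scheme for Chinese discourse annotation. The scheme describes the content and logical relations of a discourse fairly accurately and comprehensively, and serves the needs of practical tasks well.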
Discourse tagging is fundamental to natural language processing and supports a deep understanding of texts. Many applications, such as automatic summarization and question answering, benefit considerably from a thorough understanding of the text. Building on existing discourse theories such as Rhetorical Structure Theory and Centering Theory, this paper designs a new discourse tagging system that covers both the logical relations and the content of a text and addresses the practical needs of real natural language processing tasks.