Discourse relations can be expressed explicitly or implicitly. An explicit relation is marked by a discourse connective between the basic discourse units. This paper targets explicit discourse relations in Chinese and builds an explicit discourse relation parsing platform consisting of connective identification and sense classification. We annotate explicit discourse relations on 500 texts selected from the Chinese Penn Treebank (CTB). Using the headwords of connectives as features, we build a connective identifier with a maximum entropy classifier on this corpus; it achieves an F1 of 66.79%. We also build a sense classifier based on contextual features such as the connective itself and its part of speech; it achieves an F1 of 91.92% on the four top-level semantic relation classes.
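As a rough illustration of the two ingredients mentioned above, the following sketch shows how context features for a candidate connective might be collected and how an F1 score is computed from binary predictions. The feature names, the boundary markers `<BOS>`/`<EOS>`, the sample sentence, and its POS tags are assumptions made for this example and are not the paper's actual feature set; the classifier itself (maximum entropy) is not implemented here.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (word, POS tag); the tags used below are illustrative


def connective_features(tokens: List[Token], i: int) -> Dict[str, str]:
    """Collect simple context features for the candidate connective at index i.

    Hypothetical feature set: the candidate word, its POS tag, and the
    neighbouring words. A maximum entropy classifier would consume such
    features to decide whether the candidate acts as a discourse connective.
    """
    word, pos = tokens[i]
    prev_word = tokens[i - 1][0] if i > 0 else "<BOS>"
    next_word = tokens[i + 1][0] if i + 1 < len(tokens) else "<EOS>"
    return {
        "conn": word,
        "conn_pos": pos,
        "prev_word": prev_word,
        "next_word": next_word,
    }


def f1_score(gold: List[bool], pred: List[bool]) -> float:
    """F1 = 2PR / (P + R) for the positive class of a binary decision."""
    tp = sum(1 for g, p in zip(gold, pred) if g and p)
    fp = sum(1 for g, p in zip(gold, pred) if not g and p)
    fn = sum(1 for g, p in zip(gold, pred) if g and not p)
    if tp == 0:
        return 0.0  # precision or recall is zero (or undefined)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy sentence: "because rain , so cancel" (tags are assumed, not gold CTB annotation)
sentence = [("因为", "P"), ("下雨", "VV"), ("，", "PU"), ("所以", "AD"), ("取消", "VV")]
features = connective_features(sentence, 0)
score = f1_score([True, True, False, False], [True, False, True, False])
```

In this toy run, `features` describes the candidate 因为 at the start of the sentence, and `score` is computed from one true positive, one false positive, and one false negative.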