Abstract Meaning Representation (AMR) is a recently proposed method for representing the meaning of a sentence, with expressive power close to that of an interlingua. Its developers have already released English AMR corpora, including one for The Little Prince. AMR differs from earlier syntactic and semantic representations in two main respects: first, it uses a graph structure to represent sentence meaning; second, it allows concept nodes that do not appear in the original sentence to be added in order to represent implicit meaning. In this paper, taking the characteristics of Chinese into account, we design a Chinese AMR annotation specification and, on that basis, annotate an AMR corpus for the Chinese version of The Little Prince, with an inter-annotator agreement of 0.83 as measured by Smatch. The statistics show that sentences containing graph structures are highly correlated between English and Chinese and account for about 40% of all sentences, whereas the additionally added concept nodes differ considerably between the two languages. Finally, we discuss the advantages of AMR for representing the semantics of Chinese sentences and for cross-language comparison.
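To make the two technical notions above concrete, the following minimal sketch encodes the standard textbook AMR for "The boy wants to go" as a set of triples, finds the re-entrant node that makes it a graph rather than a tree, and computes a triple-overlap F-score between two annotations. This is an illustration only and is not taken from the corpus or the annotation tool: the names `GOLD`, `OTHER`, `reentrant_nodes`, and `triple_f1` are hypothetical, and the score assumes that the variables of the two annotations are already aligned, whereas the actual Smatch metric searches for the best variable mapping.

```python
from collections import Counter

# AMR for "The boy wants to go", written as (relation, source, target) triples:
# (w / want-01 :ARG0 (b / boy) :ARG1 (g / go-01 :ARG0 b))
GOLD = {
    ("instance", "w", "want-01"),
    ("instance", "b", "boy"),
    ("instance", "g", "go-01"),
    ("ARG0", "w", "b"),
    ("ARG1", "w", "g"),
    ("ARG0", "g", "b"),  # "b" is reused here, so the structure is a graph
}

# A second, hypothetical annotation that misses the re-entrant edge.
OTHER = GOLD - {("ARG0", "g", "b")}


def reentrant_nodes(triples):
    """Return variables with more than one incoming relation edge."""
    incoming = Counter(
        target for relation, _source, target in triples if relation != "instance"
    )
    return sorted(node for node, count in incoming.items() if count > 1)


def triple_f1(candidate, reference):
    """Triple-overlap F-score, assuming variable names are already aligned.

    Simplification: real Smatch maximises this score over variable mappings.
    """
    matched = len(candidate & reference)
    if matched == 0:
        return 0.0
    precision = matched / len(candidate)
    recall = matched / len(reference)
    return 2 * precision * recall / (precision + recall)


print(reentrant_nodes(GOLD))
print(round(triple_f1(OTHER, GOLD), 3))
```

Under these assumptions, a sentence counts as "containing a graph structure" when `reentrant_nodes` returns a non-empty list, and the agreement between two annotators corresponds to the F-score of their triple sets.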