Kernel-based protein-protein interaction (PPI) extraction systems can capture structural information and achieve high performance, but at the cost of high computational complexity. This paper combines diverse lexical and syntactic information, focusing on the effect of dependency information, for feature-based PPI extraction using SVMs. Experiments on multiple PPI corpora show that dependency information and base phrase chunking information effectively improve the performance of feature-based PPI extraction. In particular, our method achieves an F-measure of 54.7 on the AIMed corpus, surpassing other state-of-the-art feature-based PPI extraction systems.
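To make the idea of combining lexical and dependency information in a flat feature vector concrete, the following is a minimal sketch and not the feature set used in this work. It assumes a sentence is already tokenized and parsed into `(head, dependent, label)` edges; the function names, feature names, and the toy parse are hypothetical and chosen only for illustration.

```python
from collections import deque


def shortest_dependency_path(edges, start, end):
    """Breadth-first search over an undirected view of the dependency tree.

    edges: list of (head_index, dependent_index, label) triples.
    Returns a list of (direction-marked label, token_index) steps, or None.
    """
    graph = {}
    for head, dep, label in edges:
        # "label>" marks a step from head down to dependent,
        # "<label" marks a step from dependent up to head.
        graph.setdefault(head, []).append((dep, label + ">"))
        graph.setdefault(dep, []).append((head, "<" + label))
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == end:
            return path
        for nxt, lab in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(lab, nxt)]))
    return None


def extract_features(tokens, edges, p1, p2):
    """Build a sparse feature dict for the protein pair at token indices p1, p2."""
    feats = {}
    # Lexical features: words occurring between the two protein mentions.
    for word in tokens[min(p1, p2) + 1:max(p1, p2)]:
        feats["between=" + word.lower()] = 1
    # Dependency features: labels and words on the shortest path between them.
    path = shortest_dependency_path(edges, p1, p2)
    if path is not None:
        feats["dep_path=" + "|".join(lab for lab, _ in path)] = 1
        for _, idx in path[:-1]:  # skip the final step, which is the second protein
            feats["dep_word=" + tokens[idx].lower()] = 1
        feats["dep_len"] = len(path)
    return feats


# Toy example with a hand-written (hypothetical) parse.
tokens = ["PROT1", "interacts", "with", "PROT2"]
edges = [(1, 0, "nsubj"), (1, 3, "obl"), (3, 2, "case")]
for name, value in sorted(extract_features(tokens, edges, 0, 3).items()):
    print(name, value)
```

A dict of this kind could then be turned into a numeric vector (for example with scikit-learn's `DictVectorizer`) and passed to an SVM classifier; that step is omitted here to keep the sketch dependency-free.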