In recent years, measuring the functional similarity of genes based on the Gene Ontology has become a research hotspot. Existing methods for computing gene functional similarity can be divided into two types: pair-wise approaches and group-wise approaches. However, because of the abundance problem in Gene Ontology annotation data, a large number of genes share identical ontology annotations, which biases the results of these methods. This paper proposes an improved method for computing gene functional similarity that normalizes the semantic information content of the annotation term set in order to measure the functional similarity between genes more accurately. Experimental results show that the proposed method eliminates the influence of identical annotations on gene functional similarity calculation and achieves very good results on the test platform.