Incomplete functional annotation of proteins limits further progress and application in biological and medical research. Many proteins are annotated only to general terms in the Gene Ontology (GO), so it is necessary to push these annotations deeper, assigning already annotated proteins to more specific GO terms. This article proposes a new prediction strategy that incorporates protein interaction information to predict yeast proteins accurately into more specific functional classes. A local classifier is constructed for each candidate classification space rooted at a qualified parent node in GO, and each space is evaluated with a functional class separability index (the Index). Classification spaces whose Index exceeds a given threshold are selected, and the proteins annotated to the parent class are then predicted into its child classes. By extending the range of the deepening prediction, the classification space can be traced back as far as the root node, so that prior general annotations are pushed one or several levels deeper and yield more specific functional knowledge. The method gives good prediction results: in classification spaces traced back two levels, the average precision and recall were 94.02% and 95.82%, respectively.
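The select-then-predict procedure can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define the Index or the classifier, so a leave-one-out accuracy score and a simple vote among interaction partners are used here as stand-ins. The function names (`vote`, `separability`, `deepen`), the term and protein labels, and the threshold of 0.8 are all hypothetical.

```python
from collections import Counter

def vote(protein, children, annotations, interactions):
    """Return the child term most common among the protein's interaction
    partners, or None if no partner carries a child annotation.
    (Assumed stand-in for the paper's local classifier.)"""
    counts = Counter()
    for partner in interactions.get(protein, ()):
        if partner == protein:
            continue  # ignore self-interactions
        for child in children:
            if partner in annotations.get(child, ()):
                counts[child] += 1
    if not counts:
        return None
    # Iterating in sorted order makes ties resolve alphabetically.
    return max(sorted(counts), key=counts.get)

def separability(children, annotations, interactions):
    """Assumed stand-in for the Index: leave-one-out accuracy of the vote
    on proteins that already have a child-level annotation."""
    labelled = [(p, c) for c in children for p in annotations.get(c, ())]
    if not labelled:
        return 0.0
    hits = sum(vote(p, children, annotations, interactions) == c
               for p, c in labelled)
    return hits / len(labelled)

def deepen(parent, go_children, annotations, interactions, threshold=0.8):
    """Predict child terms for proteins annotated only to `parent`,
    but only if the local classification space passes the threshold."""
    children = go_children.get(parent, [])
    if not children:
        return {}
    if separability(children, annotations, interactions) < threshold:
        return {}  # space rejected: classes are not separable enough
    specific = set().union(*(annotations.get(c, set()) for c in children))
    predictions = {}
    for protein in sorted(set(annotations.get(parent, ())) - specific):
        child = vote(protein, children, annotations, interactions)
        if child is not None:
            predictions[protein] = child
    return predictions

# Toy data (entirely hypothetical): one parent term with two child terms.
go_children = {"parent": ["childA", "childB"]}
annotations = {
    "parent": {"P1", "P2", "P3", "P4", "P5", "P6"},
    "childA": {"P1", "P2"},
    "childB": {"P3", "P4"},
}
interactions = {
    "P1": {"P2", "P5"}, "P2": {"P1"},
    "P3": {"P4", "P6"}, "P4": {"P3"},
    "P5": {"P1", "P2"}, "P6": {"P3", "P4"},
}

predictions = deepen("parent", go_children, annotations, interactions)
print(predictions)
```

On this toy data the stand-in index is 1.0, so the space is accepted, and the two proteins annotated only to the parent (P5 and P6) are assigned to childA and childB respectively. Raising the threshold above the index makes `deepen` return an empty result, which mirrors the filtering of classification spaces described in the abstract.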