A new centrality measure, PeC, is proposed for identifying essential proteins by integrating gene expression data with the topological properties of the protein-protein interaction (PPI) network. For each edge in the network, PeC first computes the edge clustering coefficient (ECC) and the Pearson correlation coefficient (PCC) of the co-expression of the two genes (proteins) that the edge connects, and then computes the weight of the edge from ECC and PCC. The PeC value of each node is defined as the sum of the weights of all edges connected to it. Experimental results on the yeast PPI network show that PeC clearly outperforms eight other centrality measures (DC, BC, CC, SC, EC, IC, LAC and SoECC) in the prediction accuracy of essential proteins. In particular, when the number of predicted proteins is no more than 10% of the total, the prediction accuracy of PeC is more than 20% higher than that of SC, CC and EC.
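The per-edge and per-node steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The abstract states only that the edge weight is based on ECC and PCC, so the sketch assumes the weight is their product. It also assumes the common definition of ECC as the number of triangles containing the edge divided by min(d_u - 1, d_v - 1), with ECC taken as 0 when that denominator is 0. The function names and the toy data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant profile contributes no correlation
    return cov / sqrt(var_x * var_y)


def edge_clustering_coefficient(adj, u, v):
    """Assumed ECC: triangles through (u, v) over min(deg(u) - 1, deg(v) - 1)."""
    triangles = len(adj[u] & adj[v])  # common neighbours close a triangle
    denom = min(len(adj[u]) - 1, len(adj[v]) - 1)
    return triangles / denom if denom > 0 else 0.0


def pec_scores(edges, expression):
    """Sum the (assumed) ECC * PCC edge weights over the edges of each node.

    edges: unique undirected pairs (u, v), no self-loops.
    expression: dict mapping each protein to its expression profile.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    scores = {node: 0.0 for node in adj}
    for u, v in edges:
        ecc = edge_clustering_coefficient(adj, u, v)
        pcc = pearson(expression[u], expression[v])
        weight = ecc * pcc  # assumption: product of ECC and PCC
        scores[u] += weight
        scores[v] += weight
    return scores


if __name__ == "__main__":
    # Hypothetical toy network: a triangle A-B-C plus a pendant node D.
    toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
    toy_expression = {
        "A": [1, 2, 3, 4],
        "B": [2, 4, 6, 8],
        "C": [1, 2, 3, 4],
        "D": [4, 3, 2, 1],
    }
    ranked = sorted(pec_scores(toy_edges, toy_expression).items(),
                    key=lambda item: item[1], reverse=True)
    for protein, score in ranked:
        print(protein, round(score, 3))
```

Under these assumptions, an edge raises the score of its endpoints only when it lies in at least one triangle and the two genes are positively co-expressed, so candidate essential proteins would be taken from the top of the ranked list.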