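This study examines how cancer genes and non-cancer genes differ in topological structure within the human protein interaction network, in order to provide an important basis for mining potential cancer genes. First, a relatively comprehensive human protein-protein interaction (PPI) network was constructed from multiple databases, and known cancer genes were collected. Then, a large number of random samples were drawn from the PPI network, and the differences between the random samples and the known cancer genes in node degree, node betweenness, and largest connected component were analyzed statistically. The results were as follows: (1) a relatively comprehensive human PPI network was constructed; (2) a set of known breast cancer genes was obtained; (3) the known cancer gene set and the random samples from the PPI network differed significantly in topological attributes such as node degree, node betweenness, and largest connected component (P < 2.2 × 10^(-16)). Topological attributes of the PPI network, such as node degree, node betweenness, and largest connected component, can significantly distinguish cancer genes from non-cancer-related genes, and provide an important basis for discovering new, unknown cancer genes.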
This study analyzes differences between cancer-related and unrelated genes based on their topological attributes in the protein-protein interaction (PPI) network. We constructed a comprehensive human PPI network with data collected from multiple PPI databases, obtained a set of known breast cancer genes from the OMIM database, and compared cancer-related and unrelated genes based on their topological attributes in the PPI network. The results were as follows: (1) we constructed a comprehensive human PPI network and obtained a set of known breast cancer genes; (2) we showed that known cancer genes exhibit significantly different network topologies from random genes in the network. A comprehensive PPI network can help differentiate cancer genes from unrelated ones by their distinctive topological attributes, such as node degree, node betweenness, and largest connected component.
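The comparison between a known gene set and random samples from the network can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the toy edge list, and the permutation-style empirical p-value are all assumptions made for illustration, and the abstract does not state which statistical test produced the reported P value. Node betweenness is omitted to keep the sketch short. The sketch covers mean node degree and the size of the largest connected component induced by a gene set.

```python
import random
from collections import defaultdict, deque


def build_graph(edges):
    """Build an undirected adjacency map from (protein_a, protein_b) pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            adj[a].add(b)
            adj[b].add(a)
    return dict(adj)


def mean_degree(adj, nodes):
    """Average degree, measured in the full network, over a gene set."""
    nodes = list(nodes)
    return sum(len(adj.get(n, ())) for n in nodes) / len(nodes)


def largest_component_size(adj, nodes):
    """Size of the largest connected component of the subgraph induced by nodes."""
    nodes = set(nodes)
    seen, best = set(), 0
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:  # breadth-first search restricted to the gene set
            cur = queue.popleft()
            size += 1
            for nb in adj.get(cur, ()):
                if nb in nodes and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        best = max(best, size)
    return best


def empirical_p(adj, gene_set, statistic, n_samples=1000, seed=0):
    """Compare a gene set with same-size random node samples.

    Returns the observed statistic and the (add-one smoothed) fraction of
    random samples whose statistic is at least as large as the observed one.
    """
    rng = random.Random(seed)
    universe = sorted(adj)
    observed = statistic(adj, gene_set)
    hits = sum(
        statistic(adj, rng.sample(universe, len(gene_set))) >= observed
        for _ in range(n_samples)
    )
    return observed, (hits + 1) / (n_samples + 1)


# Hypothetical toy network; real input would be the merged PPI edge list.
toy_edges = [
    ("H1", "H2"), ("H1", "A"), ("H1", "B"), ("H1", "C"),
    ("H2", "A"), ("H2", "D"), ("E", "F"), ("G", "I"),
]
toy_genes = ["H1", "H2", "A"]  # stands in for a known cancer gene set

toy_adj = build_graph(toy_edges)
print(empirical_p(toy_adj, toy_genes, mean_degree))
print(empirical_p(toy_adj, toy_genes, largest_component_size))
```

In this toy network the three chosen genes have a mean degree of 3.0 and form a single connected component of size 3, so both statistics sit at the high end of what random three-node samples produce. On a real PPI network the same comparison would be run with the collected cancer gene set and a much larger number of random samples.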