Essential proteins are the most important material basis for maintaining all life activities in an organism. With the development of high-throughput technology, identifying essential proteins from protein-protein interaction (PPI) networks has become a research hotspot in proteomics. Most existing methods rely solely on network topology, and PPI data suffer from a high false-positive rate. To address both problems, this paper proposes an improved particle swarm optimization algorithm for identifying essential proteins. A high-quality weighted network is constructed by jointly considering topological characteristics of the network and multi-source biological attribute information. The closeness of the connections between protein nodes is used to measure the essentiality of a protein, and the local network topology is extended to second-order neighbors, which greatly improves prediction accuracy. A collective measure that evaluates the top-p essential proteins as a whole is also proposed, which reduces computational complexity. Experimental results on standard data sets show that, compared with other classical algorithms, the proposed algorithm identifies more essential proteins and achieves higher accuracy.
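The abstract does not give the weighting or scoring formulas, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes that an edge weight is the plain average of a topological term (the edge clustering coefficient) and a hypothetical biological similarity score in [0, 1], and that a protein's score is the total weight of the edges touching the protein or its first-order neighbors, which reaches second-order neighbors. The names `build_weighted_network`, `node_score`, `set_score`, and `bio_similarity` are illustrative.

```python
from collections import defaultdict


def build_weighted_network(edges, bio_similarity):
    """Return (neighbors, weights) for an undirected PPI network.

    edges: iterable of (u, v) protein pairs.
    bio_similarity: dict mapping frozenset({u, v}) to a score in [0, 1]
        (hypothetical stand-in for multi-source biological evidence).
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-interactions
            neighbors[u].add(v)
            neighbors[v].add(u)

    weights = {}
    for u in list(neighbors):
        for v in neighbors[u]:
            key = frozenset((u, v))
            if key in weights:
                continue
            # Topological term: edge clustering coefficient.
            common = len(neighbors[u] & neighbors[v])
            denom = min(len(neighbors[u]), len(neighbors[v])) - 1
            ecc = common / denom if denom > 0 else 0.0
            # Assumption: equal-weight average of topology and biology.
            weights[key] = 0.5 * ecc + 0.5 * bio_similarity.get(key, 0.0)
    return neighbors, weights


def node_score(node, neighbors, weights):
    """Sum the weights of all distinct edges touching the node or one of
    its first-order neighbors (so second-order neighbors are reached)."""
    region = {node} | neighbors[node]
    touched = {frozenset((a, b)) for a in region for b in neighbors[a]}
    return sum(weights[e] for e in touched)


def set_score(candidates, neighbors, weights):
    """Assumed collective score of a candidate top-p set: the sum of the
    individual node scores. A swarm optimizer could use it as fitness."""
    return sum(node_score(n, neighbors, weights) for n in candidates)


def top_p(p, neighbors, weights):
    """Greedy baseline: the p proteins with the highest node score."""
    ranked = sorted(neighbors, key=lambda n: node_score(n, neighbors, weights),
                    reverse=True)
    return ranked[:p]


if __name__ == "__main__":
    edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
    bio = {frozenset(("A", "B")): 1.0, frozenset(("C", "D")): 0.5}
    nbrs, w = build_weighted_network(edges, bio)
    print(top_p(3, nbrs, w))
```

In this sketch a particle swarm search would explore candidate sets of p proteins and use `set_score` as its fitness function; the greedy `top_p` ranking is included only as a simple baseline for comparison.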