Protein complexes are groups of two or more associated polypeptide chains that play essential roles in biological processes. Given a graph representing protein-protein interaction (PPI) data, finding protein complexes, that is, subsets of closely coupled proteins, is important but non-trivial, particularly as PPI networks have grown substantially in size in recent years. In this paper, we propose a graph-based clustering approach for protein complex detection that adopts symmetric non-negative matrix factorization and can effectively detect densely connected subgraphs in complex networks. We compare our approach with several state-of-the-art methods on three PPI networks, using the same set of well-known benchmark complexes. The experimental results show that our approach significantly outperforms the other methods on all three PPI networks, which differ in size and density.
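To illustrate the general idea of graph clustering by symmetric non-negative matrix factorization, the following is a minimal sketch and not the method proposed in this paper. It approximates an adjacency matrix A by HHᵀ with H ≥ 0, using a damped multiplicative update that is commonly used for this objective, and then assigns each node to the column of H with the largest entry. The toy graph (two 4-cliques joined by one edge), the function names `symnmf` and `assign_clusters`, and all parameter values (`iters`, `beta`, `seed`, `eps`) are assumptions chosen for illustration.

```python
import random


def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*X)]


def symnmf(A, k, iters=1000, beta=0.5, seed=0, eps=1e-12):
    """Approximate a symmetric non-negative matrix A by H H^T with H >= 0.

    Illustrative update (an assumption, not taken from the paper):
        H <- H * (1 - beta + beta * (A H) / (H H^T H))
    applied element-wise; eps only guards against division by zero.
    """
    rng = random.Random(seed)
    n = len(A)
    H = [[rng.random() for _ in range(k)] for _ in range(n)]
    for _ in range(iters):
        AH = matmul(A, H)                      # n x k
        HtH = matmul(transpose(H), H)          # k x k
        HHtH = matmul(H, HtH)                  # n x k
        H = [
            [H[i][j] * (1 - beta + beta * AH[i][j] / (HHtH[i][j] + eps))
             for j in range(k)]
            for i in range(n)
        ]
    return H


def assign_clusters(H):
    """Hard-assign each node to the column with the largest entry in its row."""
    return [max(range(len(row)), key=row.__getitem__) for row in H]


# Hypothetical toy network: nodes 0-3 and 4-7 each form a clique,
# with a single bridging edge between node 3 and node 4.
n = 8
A = [[0.0] * n for _ in range(n)]
edges = [(i, j) for group in (range(0, 4), range(4, 8))
         for i in group for j in group if i < j] + [(3, 4)]
for i, j in edges:
    A[i][j] = A[j][i] = 1.0

H = symnmf(A, k=2)
labels = assign_clusters(H)
print(labels)
```

The hard assignment used here yields disjoint clusters. Because a protein can belong to more than one complex, a practical pipeline might instead threshold the entries of H so that a node can join several clusters; that choice is left open in this sketch.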