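Protein complexes are of great importance for studying cellular activity. As new biological experimental techniques continue to emerge, large numbers of protein-protein interaction networks have been produced. Identifying protein complexes by clustering these networks is a current research hotspot. However, the performance of most existing protein complex identification algorithms is unsatisfactory. To address this, a protein complex modularity function (PQ) is proposed, and on this basis the BMM algorithm (based on protein complexes modularity function for merging modules) is presented. BMM first identifies some dense subgraphs in the network as initial modules, then merges these initial modules according to the PQ function, and finally obtains protein complexes of relatively high quality. The identified complexes are compared with two known protein complex datasets, and the results show that BMM identifies complexes well. In addition, compared with other recent identification algorithms, BMM achieves higher identification accuracy.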
Proteins often interact with each other to form complexes that carry out their biological functions, so identifying these complexes is important for understanding cellular activity. In recent years, with the rapid development of new biological experimental technologies, a large number of protein-protein interaction (PPI) networks have been generated. Identifying protein complexes by clustering proteins in PPI networks is a hot topic in current bioinformatics research. Over the last decade, many clustering methods, mainly based on graph partitioning or on community detection techniques from social network analysis, have been proposed to recognize protein complexes in PPI networks. However, the performance of most previously developed detection methods is not ideal. In particular, they cannot identify overlapping complexes, although biological studies have found that protein complexes often overlap. Therefore, this paper proposes a protein complexes modularity function (Q function), namely the PQ function, for identifying overlapping complexes in PPI networks. Based on PQ, a new algorithm for identifying protein complexes, BMM (the algorithm based on protein complexes modularity function for merging modules), is presented. First, BMM finds some dense subgraphs as initial modules. Then, these initial modules are merged by maximizing the modularity function PQ. Finally, several high-quality protein complexes are obtained. Comparison of these protein complexes with two known protein complex datasets suggests that BMM performs well. In addition, BMM is more accurate than other recent algorithms.
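
The seed-then-merge idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the BMM implementation: the abstract does not define PQ, so the sketch substitutes a generic overlap-aware variant of Newman modularity in which each protein's contribution is divided by the number of modules it belongs to, and it takes maximal cliques of at least three proteins as the "dense subgraphs". All function names (`build_adjacency`, `maximal_cliques`, `overlap_modularity`, `merge_modules`) and the toy network are hypothetical.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def maximal_cliques(adj, min_size=3):
    """Seed modules: maximal cliques (basic Bron-Kerbosch, no pivoting).

    Assumption: cliques stand in for the 'dense subgraphs' of the abstract.
    """
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            if len(r) >= min_size:
                cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def overlap_modularity(adj, modules):
    """Stand-in for PQ (an assumption): modularity with each node
    down-weighted by the number of modules that contain it."""
    two_m = sum(len(nbrs) for nbrs in adj.values())
    if two_m == 0:
        return 0.0
    membership = {}
    for mod in modules:
        for v in mod:
            membership[v] = membership.get(v, 0) + 1
    total = 0.0
    for mod in modules:
        # Each internal edge is counted in both directions, hence the 2.0.
        edge_term = sum(
            2.0 / (membership[u] * membership[v])
            for u, v in combinations(mod, 2)
            if v in adj[u]
        )
        weighted_degree = sum(len(adj[v]) / membership[v] for v in mod)
        total += edge_term - weighted_degree ** 2 / two_m
    return total / two_m


def merge_modules(adj, seeds):
    """Greedily merge the pair of modules that most increases the score;
    stop when no merge improves it."""
    modules = set(seeds)
    best = overlap_modularity(adj, modules)
    while len(modules) > 1:
        candidate, candidate_score = None, best
        for a, b in combinations(modules, 2):
            trial = (modules - {a, b}) | {a | b}
            score = overlap_modularity(adj, trial)
            if score > candidate_score + 1e-12:
                candidate, candidate_score = trial, score
        if candidate is None:
            break
        modules, best = candidate, candidate_score
    return modules


# Toy network: two triangles sharing the edge b-c, plus a third triangle
# attached through the single edge d-e.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"),
         ("d", "e"), ("e", "f"), ("e", "g"), ("f", "g")]
adj = build_adjacency(edges)
complexes = merge_modules(adj, maximal_cliques(adj))
```

On this toy network the two overlapping triangles {a, b, c} and {b, c, d} are merged into one four-protein module, while {e, f, g} stays separate because merging it with anything lowers the stand-in score. The exhaustive pairwise search is quadratic in the number of modules per step, which is fine for illustration but would need pruning (for example, only considering modules that share proteins or edges) on a real PPI network.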