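Identifying protein-protein interactions helps in studying protein function and discovering potential drug targets. This study uses amino acid composition, dipeptide composition, conjoint triad composition, composition, transition, and distribution descriptors, and autocorrelation features to characterize protein-protein interaction pairs. An optimal feature subset is selected with the minimum redundancy maximum relevance method and combined with a support vector machine to predict yeast protein-protein interactions. With the optimal feature subset, prediction accuracy on the training and test sets is 4% and 2% higher, respectively, than with dipeptide composition, which demonstrates the effectiveness of the method.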
Identification of protein-protein interactions provides useful information for elucidating protein functions and discovering drug targets. In this study, amino acid composition, dipeptide composition, conjoint triad, composition, transition, and distribution descriptors, and normalized Moreau-Broto autocorrelation features are used to characterize protein-protein interaction pairs. Minimum redundancy maximum relevance is employed to select the optimized feature subset, and a support vector machine is adopted to construct the model and predict protein-protein interactions in Saccharomyces. With the optimized subset, the accuracies on the training set and test set are about 5% and 2% higher, respectively, than those obtained with dipeptide composition, showing the effectiveness of the method.
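As an illustration of the sequence encoding step, the following minimal Python sketch computes two of the listed descriptors, amino acid composition (20 values) and dipeptide composition (400 values), and joins them for a protein pair. The function names and the choice to represent a pair by concatenating the two proteins' feature vectors are assumptions made for this example; the abstract does not state how pairs are encoded, and the remaining descriptors, the feature selection step, and the classifier are not shown.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def amino_acid_composition(seq):
    """Return the fraction of each standard residue in seq (20 values)."""
    # Non-standard symbols (e.g. X) are ignored in both counts and length.
    residues = [r for r in seq.upper() if r in AMINO_ACIDS]
    n = len(residues)
    return [residues.count(a) / n if n else 0.0 for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return the fraction of each adjacent residue pair in seq (400 values)."""
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        # Windows containing a non-standard symbol are skipped.
        if pair in counts:
            counts[pair] += 1
            total += 1
    return [counts[d] / total if total else 0.0 for d in DIPEPTIDES]


def encode_pair(seq_a, seq_b):
    """Encode a protein pair by concatenation (an assumed pair encoding)."""
    return (amino_acid_composition(seq_a) + dipeptide_composition(seq_a)
            + amino_acid_composition(seq_b) + dipeptide_composition(seq_b))


if __name__ == "__main__":
    # Toy sequences chosen for illustration only, not real proteins.
    features = encode_pair("ACAC", "GG")
    print(len(features))
```

Under these assumptions each pair is described by 840 values, which could then be passed to a feature selection step and a classifier of the kind named in the abstract.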