To predict protein-protein interactions, an ensemble learning method is proposed that uses artificial neural networks and support vector machines as member classifiers. Protein sequences are represented by feature sets derived from auto covariance encoding and from dipeptide composition. With these two representations, the prediction accuracy reached 92.16% and 94.38%, and the area under the ROC curve (AUC) reached 0.9725 and 0.9815, respectively. Comparisons among the member classifiers and the ensemble learning methods, including among the ensemble methods themselves, show that ensemble learning achieves better prediction performance, improves prediction accuracy, and avoids the instability introduced by sampling-based learning.
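As an illustration of the dipeptide composition representation named above, the following is a minimal sketch, not the authors' implementation. It assumes the common definition in which each of the 400 ordered pairs of the 20 standard amino acids is counted over adjacent residues and normalized by the number of valid pairs. The function names `dipeptide_composition` and `pair_features`, and the choice to represent a protein pair by concatenating the two vectors, are assumptions made for this example.

```python
from itertools import product

# The 20 standard amino acids (one-letter codes) and all 400 ordered pairs.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-dimensional vector of dipeptide frequencies.

    Assumption: adjacent-residue pairs containing a non-standard
    symbol (e.g. 'X') are skipped, and frequencies are normalized
    by the number of pairs that were actually counted.
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        # Sequence too short or without valid pairs: return all zeros.
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def pair_features(seq_a, seq_b):
    """Hypothetical pair encoding: concatenate both proteins' vectors."""
    return dipeptide_composition(seq_a) + dipeptide_composition(seq_b)
```

Under these assumptions, each protein pair yields an 800-dimensional feature vector that could be passed to member classifiers such as a neural network or a support vector machine.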