To improve the accuracy of protein secondary structure prediction, this paper proposes a two-level prediction model based on a GEP-BP network ensemble. First, the global search ability of gene expression programming (GEP) is used to evolve the structure and connection weights of BP networks simultaneously. All individuals of the last generation are then further trained with the BP algorithm, and a subset of them is combined into an ensemble that forms the first level of the model. Second, because neighboring network outputs are correlated, a second-level network refines the predictions of the first level. The model was tested on 36 nonhomologous protein sequences from PDBSelect25, 6122 residues in total. The results show that the proposed model predicts protein secondary structure effectively, raising the prediction accuracy to 73.02%.
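The data flow between the two levels can be illustrated with a minimal sketch. It is not the authors' implementation: the three-state labels `HEC`, the use of simple averaging as the combination method, the window parameter `half_width`, and all function names are assumptions made for illustration. GEP evolution and BP training are omitted, and the network outputs are taken as given.

```python
STATES = "HEC"  # helix, strand, coil: the usual three-state labels (assumed)


def ensemble_average(member_outputs):
    """Average per-residue state scores over ensemble members.

    member_outputs[m][i] is member m's score triple for residue i.
    Simple averaging is an assumption; the abstract only says that
    a combination method is used.
    """
    n_members = len(member_outputs)
    n_residues = len(member_outputs[0])
    return [
        [sum(member[i][k] for member in member_outputs) / n_members
         for k in range(len(STATES))]
        for i in range(n_residues)
    ]


def window_inputs(first_level, half_width=1):
    """Build second-level inputs from neighboring first-level outputs.

    Each residue gets the concatenated score triples of the residues in
    [i - half_width, i + half_width]; positions past the chain ends are
    zero-padded. The window size is a hypothetical parameter.
    """
    pad = [0.0] * len(STATES)
    n = len(first_level)
    rows = []
    for i in range(n):
        row = []
        for j in range(i - half_width, i + half_width + 1):
            row.extend(first_level[j] if 0 <= j < n else pad)
        rows.append(row)
    return rows


def decode(scores):
    """Pick the highest-scoring state for each residue."""
    return "".join(
        STATES[max(range(len(STATES)), key=row.__getitem__)] for row in scores
    )


def q3_accuracy(predicted, observed):
    """Percentage of residues whose predicted state matches the observed one."""
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)
```

In this sketch, a trained second-level network would take each row of `window_inputs(...)` as its input and emit a refined score triple for the central residue. If the reported accuracy is a per-residue measure of this kind (the abstract does not name the measure), 73.02% of 6122 residues corresponds to roughly 4470 correctly predicted residues.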