To address the limited accuracy of the 3-state hidden Markov model (HMM) in protein secondary structure prediction, a 15-state HMM is proposed, and a modified HMM algorithm is combined with a back-propagation (BP) neural network to predict secondary structure. The study uses 492 protein sequences selected from the CB513 dataset, randomly divided into 7 subsets of near-equal size. The hybrid model is applied to these sequences and its accuracy is evaluated by 7-fold cross-validation, giving a Q3 accuracy of 77.21% and an SOV of 72.52%. The results indicate that the hybrid model fully accounts for the interactions between adjacent amino acid residues while also capturing, to some extent, long-range correlations in secondary structure, which explains its relatively good prediction accuracy.
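For readers unfamiliar with the evaluation protocol, the following is a minimal sketch of a random 7-way split of 492 sequences and of the Q3 score, taken here in its usual sense as the percentage of residues whose predicted state (helix H, strand E, or coil C) matches the observed state. The function names `make_folds` and `q3_score`, the fixed seed, and the round-robin assignment of shuffled indices are illustrative assumptions, not the procedure actually used in this work.

```python
import random


def make_folds(n_items, k, seed=0):
    """Shuffle indices 0..n_items-1 and deal them into k near-equal folds.

    The seed and the round-robin dealing are assumptions for illustration.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def q3_score(predicted, observed):
    """Return the percentage of residues whose predicted state matches the observed one.

    Both arguments are equal-length strings over the three states H, E, C.
    """
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# 492 sequences cannot be split into 7 exactly equal groups,
# so two folds receive 71 sequences and five receive 70.
folds = make_folds(492, 7)
fold_sizes = sorted(len(fold) for fold in folds)

# In each round of cross-validation one fold is held out for testing
# and the remaining six are used for training.
test_fold = folds[0]
train_indices = [i for fold in folds[1:] for i in fold]
```

In practice the Q3 value for a fold would be computed over all residues of the held-out sequences, and the reported figure would be aggregated over the 7 rounds.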