Objective: To predict protein structural class accurately, laying a foundation for studying protein spatial structure and biological function. Methods: Hidden Markov models (HMMs) were applied to protein structural class prediction, and a 3-state HMM and an 8-state HMM were constructed. The data came from the protein datasets constructed by Chou and by Zhou, which contain 204 and 498 protein sequences respectively, and prediction accuracy was assessed with the rigorous jackknife (leave-one-out) cross-validation test. Results: Both the 3-state HMM and the 8-state HMM achieved their highest prediction accuracy on the all-α class; for the 3-state HMM in particular, accuracy exceeded 95%. Compared with the Chou dataset, the Zhou dataset gave higher prediction accuracy for the all-β and α/β classes, and overall prediction accuracy rose by about 2%; only the α+β class showed a decrease. Conclusion: An HMM that takes the entire protein sequence as its input can effectively predict protein structural class.
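The jackknife evaluation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. The abstract does not say how the 3-state and 8-state HMMs are parameterised or trained, so the sketch assumes one HMM per structural class and assigns a sequence to the class whose model gives the highest forward-algorithm log-likelihood. The names `forward_log_likelihood`, `train_composition_models`, `predict`, and `jackknife_accuracy` are hypothetical, and the training step is a deliberately trivial one-state stand-in for real HMM parameter estimation.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def _log(p):
    return math.log(p) if p > 0 else float("-inf")


def _log_sum_exp(values):
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(seq, start, trans, emit):
    """Return log P(seq | HMM) using the forward algorithm in log space.

    start[i] and trans[i][j] are probabilities; emit[i] maps a symbol to
    its emission probability in state i.
    """
    if not seq:
        raise ValueError("sequence must be non-empty")
    n = len(start)
    alpha = [_log(start[i]) + _log(emit[i].get(seq[0], 0.0)) for i in range(n)]
    for sym in seq[1:]:
        alpha = [
            _log_sum_exp([alpha[i] + _log(trans[i][j]) for i in range(n)])
            + _log(emit[j].get(sym, 0.0))
            for j in range(n)
        ]
    return _log_sum_exp(alpha)


def train_composition_models(samples, alphabet=AMINO_ACIDS):
    """Assumption: a one-state HMM per class with add-one smoothed residue
    frequencies. A real 3-state or 8-state model needs proper training."""
    counts = {}
    for seq, label in samples:
        counts.setdefault(label, Counter()).update(seq)
    models = {}
    for label, c in counts.items():
        total = sum(c[s] for s in alphabet) + len(alphabet)
        emit = {s: (c[s] + 1) / total for s in alphabet}
        models[label] = ([1.0], [[1.0]], [emit])  # (start, trans, emit)
    return models


def predict(models, seq):
    """Pick the class whose HMM scores the sequence highest."""
    return max(models, key=lambda lab: forward_log_likelihood(seq, *models[lab]))


def jackknife_accuracy(samples, train, predict_fn):
    """Leave-one-out test: hold out each sequence, train on the rest,
    and predict the held-out one. Returns (overall, per-class) accuracy."""
    correct, total = Counter(), Counter()
    for k, (seq, label) in enumerate(samples):
        models = train(samples[:k] + samples[k + 1:])
        total[label] += 1
        if predict_fn(models, seq) == label:
            correct[label] += 1
    per_class = {c: correct[c] / total[c] for c in total}
    return sum(correct.values()) / len(samples), per_class
```

Calling `jackknife_accuracy(samples, train_composition_models, predict)` on a list of `(sequence, class_label)` pairs returns the overall accuracy and a per-class breakdown, which is the form in which the abstract reports its results. Swapping in a different `train` function is the only change needed to evaluate a multi-state model under the same protocol.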