Proteasomes are responsible for producing the majority of cytotoxic T lymphocyte (CTL) epitopes. Hence, it is important to identify correctly which peptides proteasomes will generate from an unknown protein. However, the pool of proteasome cleavage data used in prediction algorithms, whether from major histocompatibility complex (MHC) I ligand data or in vitro digestion data, is not identical to in vivo proteasomal digestion products. Therefore, the accuracy and reliability of these models still need to be improved. In this paper, three types of proteasomal cleavage data, namely constitutive proteasome (cCP) in vitro cleavage data, immunoproteasome (iCP) in vitro cleavage data, and MHC I ligand data, were used to train cleavage-site prediction methods based on the kernel-function stabilized matrix method (KSMM). The predictive accuracies of KSMM with pair coefficients were 75.0%, 72.3%, and 83.1% for cCP, iCP, and MHC I ligand data, respectively, which were comparable to the results from a support vector machine (SVM). The three proteasomal cleavage methods were each combined in turn with MHC I-peptide binding predictions to model the MHC I peptide processing and presentation pathway. These integrations markedly improved MHC I peptide identification, increasing the area under the receiver operating characteristic (ROC) curve (AUC) from 0.82 to 0.91. The results suggested that both MHC I ligand data and proteasomal in vitro degradation data can give an exact simulation of in vivo proteasomal digestion. The information extracted from the cCP and iCP in vitro cleavage data demonstrated that both cCP and iCP are selective in their usage of peptide bonds for cleavage.
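As a rough illustration of how combining a cleavage score with an MHC I binding score can change the AUC, the following minimal sketch may help. It is not the method used in the paper: the additive combination, the `weight` parameter, the function names, and all numeric values are hypothetical and chosen only for demonstration.

```python
from typing import List, Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve, computed as the probability that a
    randomly chosen positive outscores a randomly chosen negative
    (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def combine(binding: Sequence[float], cleavage: Sequence[float],
            weight: float = 1.0) -> List[float]:
    """Hypothetical additive combination of the two prediction scores."""
    return [b + weight * c for b, c in zip(binding, cleavage)]


# Toy data (invented for illustration, not taken from the paper):
# label 1 = known MHC I ligand, label 0 = non-ligand peptide.
labels = [1, 1, 1, 0, 0, 0]
binding = [0.9, 0.4, 0.7, 0.6, 0.3, 0.5]   # assumed binding scores
cleavage = [0.8, 0.9, 0.6, 0.1, 0.2, 0.3]  # assumed cleavage scores

auc_binding = auc(binding, labels)
auc_combined = auc(combine(binding, cleavage), labels)
print(f"binding only: {auc_binding:.3f}, combined: {auc_combined:.3f}")
```

On this toy set the combined score separates ligands from non-ligands better than the binding score alone, which mirrors the direction of the improvement reported above; the actual magnitudes depend entirely on the real data and models.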