To improve the prediction accuracy of amphipathic helices (AHs) in transmembrane proteins, this paper proposes a novel helix periodicity (HP) feature for measuring amphipathicity, derived from the position-specific scoring matrix (PSSM), the protein secondary structure, and the hydrophobic moment. The MemBrain predictor is used to filter out transmembrane segments, and under-sampling is then applied; together these steps reduce the search space for AHs, while under-sampling combined with a classifier ensemble copes with class imbalance. On this basis, an ensemble of support vector machine (SVM) classifiers is trained to predict AHs. To evaluate prediction performance objectively, a relatively large and comprehensive benchmark dataset for AH prediction is constructed, the first of its kind in the field. Experimental results on this dataset show that the proposed method outperforms existing AH predictors.
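As background for the hydrophobic moment mentioned above, the following is a minimal sketch of the classical Eisenberg-style hydrophobic moment of a sequence window. It is not the HP feature proposed in this paper, whose definition is not given in the abstract. The function name `hydrophobic_moment`, the choice of the Kyte-Doolittle hydropathy scale, the 100° per-residue rotation of an ideal α-helix, and the normalization by window length are all assumptions made for illustration.

```python
import math

# Kyte-Doolittle hydropathy scale (assumed here; the paper may use another scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(segment, angle_deg=100.0, scale=KYTE_DOOLITTLE):
    """Mean hydrophobic moment per residue of a sequence window.

    Each residue i contributes a vector of length scale[residue] pointing
    at angle i * angle_deg around the helix axis; the moment is the length
    of the vector sum divided by the number of residues.
    """
    if not segment:
        raise ValueError("segment must contain at least one residue")
    delta = math.radians(angle_deg)
    sin_sum = sum(scale[aa] * math.sin(i * delta) for i, aa in enumerate(segment))
    cos_sum = sum(scale[aa] * math.cos(i * delta) for i, aa in enumerate(segment))
    return math.hypot(sin_sum, cos_sum) / len(segment)


# Hypothetical 18-residue windows: one with Leu on one helical face and Lys
# on the other, and one made of Leu only.
amphipathic = "LKKLLKKLLKLLKKLLKK"
uniform = "L" * 18

print(hydrophobic_moment(amphipathic))
print(hydrophobic_moment(uniform))
```

A window whose hydrophobic residues cluster on one face of the helix yields a large moment, whereas a uniformly hydrophobic window yields a moment near zero because the per-residue vectors cancel over full helical turns. This contrast is what makes the hydrophobic moment a natural ingredient for a periodicity-based amphipathicity feature.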