Protein fold prediction provides useful information for the heuristic search of protein tertiary structure. Most existing fold prediction methods rely on a single feature or a simple combination of several features. This paper presents a multi-feature fusion (MFF) approach that predicts 27 fold classes from the primary sequence of a protein. A support vector machine (SVM) serves as the classifier, with All-Versus-All as the multi-class classification strategy. Six features are used: amino acid composition, polarity, polarizability, van der Waals volume, hydrophobicity, and predicted secondary structure. Sixteen fusion schemes were applied to the independent test set; the best overall accuracy is 59.22%, which is 3.2% higher than that reported by Ding et al. This result and the comparison with the work of Ding et al. show that MFF is an effective method for protein fold prediction.
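To make the ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It shows how an amino acid composition vector can be computed from a sequence, how All-Versus-All tallies one vote per class pair, and one possible way to fuse the votes obtained from several feature sets. The abstract does not specify the fusion rule, so `fuse_by_vote_sum` is a hypothetical choice, and the `pairwise_winner` callback stands in for a trained binary SVM; all function names here are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence):
    """Return the 20-dimensional fractional composition of a protein sequence."""
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def class_pairs(n_classes):
    """All unordered class pairs; All-Versus-All needs one binary classifier per pair."""
    return list(combinations(range(n_classes), 2))


def ava_votes(pairwise_winner, n_classes):
    """Tally one vote per class pair.

    pairwise_winner(a, b) is a placeholder for a trained binary classifier
    (for example an SVM) and must return either a or b.
    """
    votes = [0] * n_classes
    for a, b in class_pairs(n_classes):
        votes[pairwise_winner(a, b)] += 1
    return votes


def fuse_by_vote_sum(vote_lists):
    """Hypothetical fusion rule (an assumption, not taken from the paper):
    add the per-class votes from each feature set and return the top class."""
    totals = [sum(column) for column in zip(*vote_lists)]
    return max(range(len(totals)), key=totals.__getitem__)
```

With 27 fold classes, `class_pairs(27)` yields 351 pairs, so each feature set would require 351 binary classifiers under the All-Versus-All strategy. Summing votes is only one of many conceivable fusion rules; weighted sums or a second-stage classifier over the vote vectors would slot into the same place.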