Protein Structural Class Prediction Based on Multi-Feature Information and Ma-Ada Multi-Classifier Fusion

ISSN: 0258-8021
Journal: Chinese Journal of Biomedical Engineering (《中国生物医学工程学报》)
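Classification: R318 [Medicine and Health: Biomedical Engineering; Medicine and Health: Basic Medicine]
Author affiliation: [1] College of Life Information Science and Instrument Engineering, Hangzhou Dianzi University, Hangzhou 310018
Funding: National Natural Science Foundation of China (61271063); National Key Basic Research Program of China (973 Program) (2013CB329502); National Science Fund for Distinguished Young Scholars (60788101)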

Keywords: protein structural class prediction, feature information set, Ma-Ada multi-classifier fusion

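Abstract (translated from the Chinese):

Protein sequence feature representation and the machine learning algorithm are two important factors that determine how well protein structural classes can be predicted. Using two feature extraction methods, k-word statistical frequency and k-fragment position distribution, this study fuses the extracted amino acid sequence information and physicochemical property information with protein secondary structure information to build 17-dimensional and 57-dimensional feature information sets. It also introduces the idea of Multi-Agent fusion into the Adaboost.M1 algorithm and proposes a Ma-Ada multi-classifier fusion algorithm. Used as a prediction tool for protein structural classes, the algorithm fully exploits the measurement-level information of the single classifiers as well as the interaction and fusion information among them. Experimental results show that, on the four protein datasets Z277, Z498, 1189 and D640, the Ma-Ada algorithm reaches classification rates of 91.3%, 96.8%, 85.3% and 87.2% with the 57-dimensional feature information set, and 90.6%, 95.8%, 84.8% and 88.3% with the 17-dimensional feature information set. Compared with the results of other protein structural class prediction methods, the proposed method achieves good classification rates.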
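The k-word statistical frequency mentioned in the abstract can be illustrated with a minimal sketch. It assumes that a "k-word" is a length-k substring counted over overlapping windows and normalised by the number of windows; the abstract does not spell out the exact definition, so the function name `k_word_frequencies`, the three-letter secondary-structure alphabet and the example string are all hypothetical.

```python
from collections import Counter
from itertools import product


def k_word_frequencies(sequence, alphabet, k):
    """Return the relative frequency of every length-k word over `alphabet`.

    Assumption: words are counted over overlapping windows and divided by
    the number of windows. This is an illustration, not the paper's code.
    """
    words = ["".join(p) for p in product(alphabet, repeat=k)]
    total = len(sequence) - k + 1
    if total <= 0:
        # Sequence shorter than k: no windows, so every frequency is zero.
        return {w: 0.0 for w in words}
    counts = Counter(sequence[i:i + k] for i in range(total))
    return {w: counts.get(w, 0) / total for w in words}


# Hypothetical secondary-structure string (H = helix, E = strand, C = coil).
example = "CHHHHEECC"
freqs = k_word_frequencies(example, "HEC", 2)
```

With a three-letter alphabet and k = 2 this yields a 9-dimensional vector; how the paper combines such vectors into its 17-D and 57-D sets is not stated in the abstract.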

English abstract:

Protein sequence features and machine learning algorithms are two important factors that determine the results of protein structural class prediction. In this study, we established 17-D and 57-D feature information sets by fusing sequence information and physicochemical information with secondary structure information, based on the k-word statistical frequency and k-fragment position distribution feature extraction methods. By introducing the Multi-Agent idea into the Adaboost.M1 algorithm, we proposed a novel method for protein structural class prediction, called the Ma-Ada multi-classifier fusion algorithm, which fully utilizes the measurement-level information of the single classifiers and the fused information exchanged among individual classifiers. Four protein datasets, Z277, Z498, 1189 and D640, were used to validate the performance of the Ma-Ada algorithm. The classification accuracies are 91.3%, 96.8%, 85.3% and 87.2% with the 57-D features, and 90.6%, 95.8%, 84.8% and 88.3% with the 17-D features on datasets Z277, Z498, 1189 and D640, respectively. The experimental results show that the method achieves better classification accuracy than other protein structural class prediction methods.
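For orientation, the sample-reweighting step of standard AdaBoost.M1, the algorithm that Ma-Ada builds on, can be sketched as below. This is the textbook procedure only. The Multi-Agent fusion that distinguishes Ma-Ada is not described in the abstract and is not reproduced here. The function names, the class labels and the toy data are hypothetical.

```python
import math


def adaboost_m1_round(weights, y_true, y_pred):
    """One AdaBoost.M1 reweighting step.

    Returns (new_weights, vote_weight), or None when the weak learner's
    weighted error is 0 or at least 0.5, where the procedure stops.
    """
    error = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    if error <= 0.0 or error >= 0.5:
        return None
    beta = error / (1.0 - error)
    # Down-weight correctly classified samples, then renormalise.
    scaled = [w * beta if t == p else w
              for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(scaled)
    return [w / total for w in scaled], math.log(1.0 / beta)


def weighted_vote(predictions, vote_weights):
    """Pick the label with the largest summed vote weight."""
    scores = {}
    for label, weight in zip(predictions, vote_weights):
        scores[label] = scores.get(label, 0.0) + weight
    return max(scores, key=scores.get)


# Hypothetical toy data: four equally weighted samples, one misclassified.
weights = [0.25, 0.25, 0.25, 0.25]
y_true = ["alpha", "beta", "alpha/beta", "alpha+beta"]
y_pred = ["alpha", "beta", "alpha/beta", "alpha"]
new_weights, vote_weight = adaboost_m1_round(weights, y_true, y_pred)
```

After this round the misclassified sample carries half of the total weight, so the next weak classifier is pushed to focus on it; `weighted_vote` then combines the per-round predictions for a single sample using the returned vote weights.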
