Understanding the rules of protein folding is one of the major frontier problems in the life sciences, and fold classification is the foundation of protein folding research. Using the LIFCA database, we selected as research objects the 55 α/β-class protein fold types whose sample size is greater than 2. Drawing on the definition of each fold type and its conserved topological features, we determined a template and the corresponding characteristic parameters for each of the 55 fold types. We built a template-based scoring function, Mul-Fscore, and, by combining it with secondary-structure parameter information, obtained a multi-template classification method for the 55 α/β-class fold types. Tested on 931 samples from the LIFCA database, the method achieved an average specificity, average sensitivity, and Matthews correlation coefficient (MCC) of 99.58%, 79.47%, and 79.39%, respectively. Compared with TM-score classification, Mul-Fscore gave better sensitivity and MCC, with similar average specificity.
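For readers unfamiliar with the three reported metrics, the sketch below shows one common way to compute them for a multi-class fold classifier. It assumes a one-vs-rest treatment of each fold type followed by an unweighted (macro) average; the abstract does not state which averaging scheme the authors used, and the function names `one_vs_rest_metrics` and `macro_average` are hypothetical, introduced here for illustration only.

```python
import math


def one_vs_rest_metrics(true_labels, predicted_labels, fold):
    """Sensitivity, specificity and MCC for one fold type versus all others."""
    tp = fp = tn = fn = 0
    for t, p in zip(true_labels, predicted_labels):
        if t == fold and p == fold:
            tp += 1
        elif t != fold and p == fold:
            fp += 1
        elif t == fold and p != fold:
            fn += 1
        else:
            tn += 1
    # Guard against empty denominators (assumption: report 0.0 in that case).
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


def macro_average(true_labels, predicted_labels):
    """Unweighted mean of the per-fold metrics (assumed averaging scheme)."""
    folds = sorted(set(true_labels))
    per_fold = [one_vs_rest_metrics(true_labels, predicted_labels, f) for f in folds]
    n = len(per_fold)
    return tuple(sum(m[i] for m in per_fold) / n for i in range(3))
```

Calling `macro_average(true_labels, predicted_labels)` on two equal-length lists of fold labels returns the triple (average sensitivity, average specificity, average MCC) as fractions between 0 and 1 (MCC may be negative); multiplying by 100 gives percentages in the form reported above.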