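To make effective use of protein tandem mass spectrometry data and further improve the accuracy of protein identification, a decision-tree-based algorithm for the secondary evaluation of protein identification results is proposed. Tandem mass spectrometry has become the most effective technique for protein identification. As protein tandem mass spectrometry data have accumulated in large quantities, the number of protein identification algorithms has also grown. However, existing protein identification algorithms usually return a very long list of results, so re-evaluating the results in that list is an important step in improving identification accuracy. To address this problem, frequent pattern mining is first used to obtain characteristic information about b-ions, and a re-evaluation algorithm for protein identification results, named ReCheck, is then proposed based on decision tree theory. Experimental results show that the algorithm effectively improves the accuracy of protein identification.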
To make efficient use of protein tandem mass spectrometry data and improve the accuracy of protein identification, a result re-evaluation method based on a decision tree model is proposed. Tandem mass spectrometry has recently become the most powerful tool for protein identification owing to its high sensitivity and accuracy, and many protein identification algorithms have been proposed. However, searching a protein database with spectra often returns a ranked list containing a huge number of results, so re-evaluating these results is an important step in improving identification accuracy. To address this problem, frequent pattern mining is first used to discover the characteristics of b-ions, and a result re-evaluation algorithm named ReCheck is then proposed based on the decision tree model. The experimental results show that the ReCheck algorithm improves the accuracy of protein identification.
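
As a rough illustration of the kind of b-ion evidence a re-evaluation step could draw on, the following Python sketch computes singly charged b-ion m/z values for a candidate peptide, measures the fraction found in an observed peak list, and applies a small hand-written decision rule. This is not the ReCheck algorithm itself: the function names, the 0.5 Da tolerance, the two-level rule, and its thresholds are all assumptions made for illustration. In the method described above, the b-ion features are obtained by frequent pattern mining and the tree is built from data.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON_MASS = 1.007276  # Da


def b_ion_mz(peptide):
    """Return singly charged b-ion m/z values b1 .. b(n-1) for a peptide."""
    total = 0.0
    ions = []
    for residue in peptide[:-1]:  # the full-length fragment is not a b-ion
        total += RESIDUE_MASS[residue]
        ions.append(total + PROTON_MASS)
    return ions


def b_ion_match_fraction(peptide, peaks, tol=0.5):
    """Fraction of theoretical b-ions with an observed peak within tol Da."""
    ions = b_ion_mz(peptide)
    if not ions:
        return 0.0
    matched = sum(1 for mz in ions if any(abs(mz - p) <= tol for p in peaks))
    return matched / len(ions)


def accept_identification(search_score, b_fraction):
    """Hypothetical two-level decision rule (illustrative thresholds only).

    A learned decision tree would replace this hand-written rule.
    """
    if b_fraction >= 0.6:
        return True
    return b_fraction >= 0.3 and search_score >= 2.5


def re_evaluate(candidates, peaks):
    """Keep only (peptide, search_score) pairs that pass the rule."""
    kept = []
    for peptide, score in candidates:
        fraction = b_ion_match_fraction(peptide, peaks)
        if accept_identification(score, fraction):
            kept.append((peptide, score, fraction))
    return kept
```

Under these assumptions, calling `re_evaluate` with the ranked list returned by a database search engine and the peaks of one spectrum would prune candidates whose b-ion series is poorly supported by that spectrum.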