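Protein identification based on biological mass spectrometry has become one of the supporting technologies of proteomics research. The resulting data are processed mainly by database searching, and a major drawback of this approach is that it cannot identify proteins absent from the database. Making full use of mass spectrometry data is therefore of great significance to proteomics research, and novel protein identification is an important part of that effort. Novel protein identification is one aspect of protein identification, and novel proteins are defined at three levels according to how much is known about their sequence and function. Building on the methods of protein identification, current approaches to novel protein identification fall into two broad categories: de novo sequencing combined with similar-sequence searching, and searching nucleic acid databases such as EST and genome databases. Each category has its own advantages and disadvantages, its own problems, and corresponding strategies for handling them. Researchers can apply and develop different identification methods according to their specific aims, and novel protein identification will become more refined as proteomics research advances.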
The combination of tandem mass spectrometry and database searching is one of the most popular technologies for protein identification. However, only proteins present in the searched database can be identified, and current databases are far from complete. It is therefore necessary to mine MS/MS data comprehensively, and novel protein identification is one of the most important aspects of this task. Novel proteins can be defined at three levels according to the annotation of their sequences and functions. As a part of protein identification, the main approaches used to identify novel proteins are based on two different strategies: de novo sequencing combined with similarity searching, and searching against nucleic acid databases such as EST or genome databases. Several mature or newly developed methods and techniques are summarized, and the problems and strategies discussed here should be helpful for related research.