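The genome of uropathogenic Escherichia coli (UPEC) strain CFT073 was completely sequenced and annotated in 2002. However, research on this genome remains far from complete, as shown first of all by systematic errors in the genome annotation and by its failure to keep pace with newer knowledge. Using a series of bioinformatics methods and tools, the authors systematically corrected and supplemented the RefSeq genome annotation with respect to protein-coding genes, RNA-coding genes, and other features, and on this basis identified a set of new candidate virulence factor genes. Further analysis shows that the resulting annotation gives a more accurate and complete description of several important regulatory relationships and mechanisms related to CFT073 pathogenicity.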
The complete genome sequence of uropathogenic Escherichia coli (UPEC) strain CFT073 was published in 2002. However, genomic studies of CFT073 remain unsatisfactory, largely because the current gene annotation in public databases contains systematic errors and is outdated. In this paper, the authors carried out a systematic re-annotation, combining a series of bioinformatics tools with manual curation, to provide a comprehensive understanding of virulence in the CFT073 genome. The re-annotation yields corrections, modifications, and additions to both protein-coding and RNA-coding genes. Based on the re-annotation, a group of new virulence factor candidates is listed. Further analysis indicates that the present work may facilitate an accurate and comprehensive depiction of the virulence mechanisms of strain CFT073.