The whole genome of Trichoderma harzianum Th-33 was sequenced on the HiSeq 2500 high-throughput sequencing platform. The assembly comprised 196 scaffolds, and 10,849 genes were predicted, with an average length of 1,776 bp (GenBank accession no. PRJNA272949). Of the predicted genes, 6,238 were annotated against the Gene Ontology (GO) database, and 6,789 were mapped to 279 pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database. KEGG enrichment analysis showed that the aminobenzoate degradation pathway involved the most genes (232), followed by the bisphenol degradation pathway (206). The Rfam database was used for RNA classification and prediction across the genome sequence; 7,123 genes were assigned to 25 categories, the largest of which was posttranslational modification, protein turnover and chaperones. Mycoparasitism-related genes encoding carbohydrate-active enzymes and proteases, together with genes related to secondary metabolism, were compared among the genomes of T. harzianum, T. atroviride, T. virens and T. reesei. These results contribute to a deeper understanding of the biocontrol mechanisms of Trichoderma and support the discovery and use of its functional genes.