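Using bioinformatics methods, we collected and curated the expressed sequence tag (EST) sequences deposited in the GenBank database as of May 2008 for eight oil crops: rapeseed, peanut, sesame, soybean, sunflower, castor bean, flax, and palm. A total of 1,185,911 EST sequences were obtained. Comprehensive and categorized analyses were carried out on the Linux operating system with software including Crosmatch, RepeatMasker, Phrap, CAP3, EMBOSS, Blast, EST-pipeline, ORF finder, Interproscan, blast2go, and IdentiCS, yielding 289,892 UniEST sequences. From the gene annotations of these EST sequences, genes related to lipid metabolism were screened out and used as the basis for constructing a comparative map of lipid metabolic pathways in oil crops. This study lays a foundation for building a database of lipid metabolism-related genes in oil crops and for comparing the similarities and differences in lipid metabolism among different oil crops.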
This is the first report of a systematic study of expressed genes in oil crops by means of expressed sequence tag (EST) analysis. A total of 1,185,911 EST sequences were obtained from GenBank. Cluster analysis identified contigs and singletons, yielding 289,892 UniEST sequences. Putative functions were assigned to about 60.6% of these non-redundant UniESTs on average. From these gene function annotations, we screened for genes related to the oil synthesis metabolic pathway and constructed a comparative metabolic pathway network for the eight oil crops. This EST exploration work reconstructed the potential metabolic network from public EST sequence resources, enabled visualization and exploration of highly complex genome-level metabolic networks, and can accelerate the use of EST data for studying and comparing differences in seed oil content among these common oil crops.