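An analysis of 10,000 EST sequences from the genus Eucalyptus identified 1,775 microsatellite repeats in 1,499 of the sequences; EST sequences containing microsatellites thus account for about 15% of the total. The rate of variation in microsatellite length in Eucalyptus ESTs was negatively correlated with repeat unit length, and microsatellite abundance was also negatively correlated with repeat unit length (except for trinucleotide repeats). Trinucleotide repeats were the most abundant microsatellites in Eucalyptus ESTs, and their over-enrichment may result from selection imposed by the genetic code. In terms of microsatellite abundance and length variation, Eucalyptus ESTs and the annotated transcript sequences of the poplar (Populus trichocarpa) genome showed the same pattern of change with repeat unit length, but microsatellite frequency and the content of trinucleotide repeats were significantly lower in Eucalyptus ESTs, suggesting that genes containing microsatellites are very likely expressed at lower abundance than genes without them. Primers were designed for all detected microsatellite loci and tested by PCR; the results showed that the designed primers had a very high amplification success rate.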
We analyzed 10 000 expressed sequence tags (ESTs) of Eucalyptus deposited in GenBank and detected 1 775 microsatellites distributed among 1 499 EST sequences; thus, about 15% of the Eucalyptus EST sequences contain one or more microsatellites. Variation in microsatellite length was negatively correlated with repeat motif length. Except for triplet repeats, the abundance of the other three classes of microsatellites was also negatively correlated with repeat motif length. Triplet repeats were the most abundant microsatellites in the Eucalyptus EST sequences, and their overabundance might result from selection imposed by the genetic code. A comparison of microsatellites in the Eucalyptus ESTs with those in the transcript sequences annotated from the poplar (Populus trichocarpa) genome revealed similar trends in microsatellite length variation and abundance as a function of repeat motif length. However, the microsatellite frequency and the content of triplet repeats were significantly lower in Eucalyptus than in poplar, which might reflect lower expression of microsatellite-containing genes relative to genes without microsatellites. We subsequently designed simple sequence repeat (SSR) primers for all detected microsatellite loci and tested them by PCR; the primers showed a very high amplification success rate.
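The kind of screen summarized above (locating tandem repeats in EST sequences, then tallying them by repeat motif length) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract does not state the search criteria, so the motif lengths (2 to 5 bases), the minimum repeat counts in `MIN_REPEATS`, and the function names `find_microsatellites` and `summarize` are all assumptions made here for illustration. Only perfect (uninterrupted) repeats are reported.

```python
import re
from collections import Counter

# ASSUMPTION: illustrative thresholds only (motif length -> minimum number of
# tandem copies). The source does not report the criteria it actually used.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. not 'AGAG')."""
    return motif not in (motif + motif)[1:-1]


def find_microsatellites(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # One motif of length k followed by at least n - 1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):  # skip e.g. (AGAG)n, already counted as (AG)n
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


def summarize(sequences):
    """Fraction of sequences with at least one hit, and hit counts per motif length."""
    with_ssr = 0
    by_motif_len = Counter()
    for seq in sequences:
        hits = find_microsatellites(seq)
        with_ssr += bool(hits)
        by_motif_len.update(len(motif) for _, motif, _ in hits)
    return with_ssr / len(sequences), by_motif_len


if __name__ == "__main__":
    # Toy sequences, not data from the study.
    ests = ["GGC" + "AG" * 8 + "TTCC" + "GAA" * 6 + "CC", "ACGTACGGTCA"]
    print(find_microsatellites(ests[0]))
    print(summarize(ests))
```

Applied to the figures reported above, the same ratio is 1 499 / 10 000 = 14.99%, which is the "about 15%" of EST sequences containing at least one microsatellite.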