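Microsatellites are a class of highly repetitive sequences in eukaryotic genomes. They are generally distributed in introns and intergenic regions, but gene coding regions also contain a certain number of them. To examine whether genes containing microsatellites tend to be expressed at lower frequency, 421,725 poplar EST sequences from the NCBI public database were analyzed. Of these, 53,524 ESTs contained microsatellites, a proportion of 12.69%; among the 45,555 genes annotated in the poplar genome, 6,953 contained microsatellites, or 15.26% of all genes. A significance test of the difference between the two sample frequencies showed that microsatellites occur at a significantly lower frequency in expressed sequences than in annotated genes (p < 0.01), which indicates that genes containing microsatellites are, overall, expressed at lower levels. Analysis of the characteristics of microsatellites in the expressed sequences showed that trinucleotide repeats were the most abundant. Here, the authors propose the hypothesis that genes containing microsatellites may have lower overall expression levels, and test this hypothesis using the large volume of DNA sequence data available in public poplar databases.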
Microsatellites are highly repetitive sequences in eukaryotic genomes. They are commonly found in intronic and intergenic regions, but genic regions also contain a number of them. Microsatellites are among the most variable sequences in the genomes of different organisms, and a mutation in a microsatellite within a gene can cause the gene to produce a shorter or completely different protein. Genes containing microsatellites are therefore expected to be strongly affected by selection. A low expression level is supposed to be one of the mechanisms that relax selection against such genes and help them survive. In this paper, we analyzed 421,725 poplar ESTs from the publicly available NCBI database and detected 53,524 ESTs containing microsatellites, accounting for 12.69% of the investigated ESTs. In contrast, of the 45,555 gene models annotated from the poplar genome sequence, 6,953 contained microsatellites, accounting for 15.26% of all genes. A frequency test between the EST data set and the annotated gene set showed that microsatellites occur at a significantly lower frequency in ESTs than in annotated genes (p < 0.01). The results therefore indicate that the frequency of microsatellites in expressed genes is lower than the level expected across all genes. The characteristics of microsatellites in ESTs were also explored, and triplet repeats were found to be the most frequent microsatellites in ESTs. In this paper, the hypothesis that genes containing microsatellites may have low expression levels is proposed for the first time, and a large number of ESTs are analyzed to test it. This study provides important evidence for understanding the survival mechanism of microsatellites in genes.
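
The frequency comparison reported above can be reproduced approximately from the published counts. The abstract does not say which significance test was used, so the sketch below assumes a pooled two-proportion z-test with a one-sided alternative (EST frequency lower than gene frequency). The function name `two_proportion_z_test` is hypothetical, and treating every EST as an independent observation is a simplifying assumption.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-proportion z-test (assumed test; not stated in the abstract).

    Returns (z, p_lower), where p_lower is the one-sided p-value for the
    alternative that the first proportion is lower than the second.
    """
    p1 = x1 / n1
    p2 = x2 / n2
    # Pooled proportion under the null hypothesis of equal frequencies.
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Lower-tail probability of the standard normal distribution at z.
    p_lower = 0.5 * math.erfc(-z / math.sqrt(2))
    return z, p_lower


# Counts taken from the abstract.
EST_WITH_SSR, EST_TOTAL = 53524, 421725
GENES_WITH_SSR, GENES_TOTAL = 6953, 45555

z, p = two_proportion_z_test(EST_WITH_SSR, EST_TOTAL, GENES_WITH_SSR, GENES_TOTAL)
print(f"ESTs with microsatellites:  {EST_WITH_SSR / EST_TOTAL:.2%}")
print(f"Genes with microsatellites: {GENES_WITH_SSR / GENES_TOTAL:.2%}")
print(f"z = {z:.2f}, one-sided p = {p:.3g}")
```

With these counts the z statistic is strongly negative and the one-sided p-value falls far below 0.01, which is consistent with the significance level reported in the abstract under the assumptions stated above.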