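We developed SSRFinder, a Perl program for detecting SSRs in genome sequences, and used it to identify 114 520 SSRs in the genome sequence of the Eurasian grape (Vitis vinifera L.) Pinot Noir line PN40024 released by the French national sequencing center (Genoscope). SSRs with mono-, di-, tri-, tetra-, penta-, and hexanucleotide repeat units numbered 37 648 (32.9%), 30 123 (26.3%), 18 705 (16.3%), 14 566 (12.7%), 3 492 (3.0%), and 9 986 (8.7%), respectively. Within each SSR class, the frequencies of repeat units with different nucleotide compositions differed considerably: A/T-rich repeat units were the most frequent and C/G-rich repeat units the least frequent. Across the genome, SSRs were located mainly in intergenic regions, followed by the untranslated regions of genes, and their density was lowest in coding regions. Tri- and hexanucleotide SSRs were markedly denser in translated regions than other SSR types. From these SSR sequences, a total of 80 065 (69.9%) SSR primer pairs were designed. In addition, we developed DGSSR, a Web-based SSR database that collects and annotates SSRs from the whole genome and from ESTs and provides a query interface; it is available at http://www.yaolab.sh.cn/ssr.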
We developed a Perl script, SSRFinder, to detect SSRs in the grape genome sequence. A total of 114 520 SSRs were identified in the publicly available genomic DNA sequence of Vitis vinifera L. 'Pinot Noir PN40024'. Among them, 37 648 mononucleotide, 30 123 dinucleotide, 18 705 trinucleotide, 14 566 tetranucleotide, 3 492 pentanucleotide, and 9 986 hexanucleotide repeats were found, accounting for 32.9%, 26.3%, 16.3%, 12.7%, 3.0%, and 8.7% of the total SSRs, respectively. SSRs with poly(A/T)n repeats were the most abundant type, whereas C/G-rich motifs were the rarest. We also assessed the distribution of SSRs across genomic regions. SSRs were located mainly in intergenic regions and were moderately abundant in UTRs. In coding regions, all repeat types were less frequent, except for tri- and hexanucleotide repeats. To make these SSRs accessible, we developed an online database. The database of grape SSRs (DGSSR) comprehensively collects and annotates grape SSRs. It contains all SSRs detected in this study together with their related information, and provides a flexible query interface and detailed annotations for each individual SSR. It also contains SSRs detected in the Vitis vinifera L. EST dataset. The DGSSR is available at http://www.yaolab.sh.cn/ssr.
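To illustrate the kind of search described above, the following is a minimal sketch of perfect-SSR detection for repeat units of 1 to 6 nucleotides. It is written in Python rather than Perl and is not the SSRFinder implementation: the function names `find_ssrs` and `is_primitive` and the minimum repeat counts in `MIN_REPEATS` are assumptions made for illustration, since the abstract does not state the thresholds that were used.

```python
import re

# Hypothetical minimum repeat counts per repeat-unit length (1-6 nt).
# These thresholds are assumptions; the abstract does not give SSRFinder's values.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_primitive(motif):
    """Return True if the motif is not a repetition of a shorter unit.

    For example, 'AT' is primitive, while 'ATAT' (= 'AT' * 2) is not.
    """
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Find perfect SSRs in a DNA sequence.

    Returns a sorted list of (start, motif, repeat_count) tuples,
    where start is the 0-based position of the first repeat unit.
    """
    seq = seq.upper()
    hits = []
    for size, min_rep in sorted(min_repeats.items()):
        # A unit of `size` bases followed by at least (min_rep - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip units such as 'AA' or 'ATAT', which belong to a shorter class.
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // size))
    return sorted(hits)


if __name__ == "__main__":
    example = "GGC" + "A" * 12 + "CGT" + "AG" * 7 + "TTC" + "ACG" * 5 + "T"
    for start, motif, count in find_ssrs(example):
        print(start, motif, count)
```

The sketch reports only perfect repeats and scans each repeat-unit length independently with non-overlapping regular-expression matches, so compound or imperfect SSRs, and a repeat that begins inside a run already consumed by a rejected match, may be missed. Counting the hits per unit length would give a class breakdown analogous to the mono- to hexanucleotide counts reported above.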