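The lack of genomic information for Cymbidium ensifolium has limited the development and use of molecular markers for the species. To develop reliable and effective molecular markers, this study sequenced the C. ensifolium transcriptome on a large scale and searched the data for simple sequence repeats (SSRs). Eighty genic-SSRs were selected for primer design according to their motif, length and location. Amplification was tested in 12 C. ensifolium cultivars: 61 primer pairs amplified successfully, of which 39 were polymorphic, with polymorphism information content (PIC) ranging from 0.141 to 0.746 and averaging 0.348. The unigene sequences containing the 61 successfully amplified genic-SSRs were annotated against the non-redundant (NR) database, Gene Ontology (GO), eukaryotic orthologous groups (KOGs) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. The unigenes containing 52 of the genic-SSRs matched NR sequences, 38 unigenes were assigned to 25 GO terms, 18 were assigned to 9 KOG categories, and 15 genic-SSRs were involved in 14 KEGG pathways. These results indicate that the 39 primer pairs derived from the C. ensifolium transcriptome are not only polymorphic but also carry gene-function information through annotation, making them ideal molecular markers for map construction and gene mapping. Because these primers are transferable to some degree, they can also be used to analyse genetic diversity and population structure across the genus Cymbidium.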
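The SSR search over assembled unigenes can be illustrated with a minimal sketch. It is not the tool or the parameter set used in the study, which this abstract does not specify. The function name `find_ssrs`, the minimum repeat counts in `MIN_REPEATS`, and the exclusion of mononucleotide repeats are all assumptions made for illustration.

```python
import re

# Hypothetical minimum repeat counts per motif length (not taken from the study).
# Mononucleotide repeats are deliberately left out of this sketch.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}

def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-base motif followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "ATAT" is "AT" twice),
            # so each repeat tract is reported only under its shortest motif.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

# Toy sequence: an (AG)7 tract and a (CTT)5 tract separated by non-repetitive bases.
example = "GGC" + "AG" * 7 + "CCATG" + "CTT" * 5 + "GA"
ssrs = find_ssrs(example)
```

Each reported start position could then be compared with predicted ORF boundaries to assign the SSR to the 5'UTR, coding region or 3'UTR, which is the kind of location information the selection step relies on.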
Owing to the lack of genomic resources for Cymbidium ensifolium, the development and application of molecular markers have been limited. To obtain more efficient markers, we deeply sequenced the C. ensifolium transcriptome and searched it for simple sequence repeats (SSRs). High-quality reads were assembled into 101 423 unigenes. The unigenes were searched for gene-derived SSRs (genic-SSRs), which yielded 17 793 genic-SSRs. The location of each SSR was estimated from the open reading frame (ORF) and untranslated regions (UTRs) within the unigenes. A total of 3 065 genic-SSRs were located within the 5'UTR; 5 235 were within the 3'UTR; 2 907 were within the coding region; and the location of 6 586 was undetermined. Eighty genic-SSRs were chosen from this set on the basis of their motif, size and location. Primers were designed for the 80 genic-SSRs using the software "PRIMER 3.0" and then tested for amplification. Of the 80 genic-SSR primer pairs, 61 succeeded in PCR amplification and 39 showed polymorphism among 12 C. ensifolium accessions. Among the 39 polymorphic markers, the polymorphism information content (PIC) averaged 0.348, ranging from 0.141 to 0.746. The 39 polymorphic markers identified 108 alleles among the 12 C. ensifolium accessions, with an average of 2.77 alleles per locus and a range of 2 to 6. SSR26 was the most polymorphic marker, with a PIC of 0.746 and 6 alleles. Nei's genetic distance was estimated for each pair of the 12 C. ensifolium accessions; it ranged from 0.010 to 0.475, with an average of 0.260. In addition, the unigenes containing genic-SSRs were annotated on the basis of BLAST similarity searches. The unigenes containing the 61 genic-SSRs were searched against the non-redundant (Nr), gene ontology (GO), eukaryotic orthologous groups (KOGs) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases, respectively. Most of them were annotated as genes associated with important biological functions.
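As an illustration of how a per-locus PIC value can be obtained from allele frequencies, the sketch below uses the commonly cited Botstein et al. (1980) form, PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j². The abstract does not state which formula or software was used, so that choice is an assumption, and the function name `pic` and the example frequencies are hypothetical.

```python
from itertools import combinations

def pic(freqs):
    """Polymorphism information content for one locus.

    freqs: allele frequencies at the locus; they must sum to 1.
    Assumes the Botstein et al. (1980) formula, which the source does not confirm.
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Sum of squared frequencies (expected homozygosity).
    homozygosity = sum(p * p for p in freqs)
    # Correction term summed over every unordered pair of distinct alleles.
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross

# Hypothetical frequencies, not data from the study.
two_equal = pic([0.5, 0.5])   # two equally frequent alleles
six_equal = pic([1 / 6] * 6)  # six equally frequent alleles
```

Under this formula a locus with two alleles cannot exceed a PIC of 0.375, so values well above that require several alleles at the locus. The reported mean allele number is simply the ratio of the two counts given above: 108 / 39 ≈ 2.77.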
In total, 52 unigenes hit Nr sequences, of which 38 were cl