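[Objective] To obtain the gDNA sequences of the tea plant CHS genes (CsCHS) and thereby determine the CsCHS gene structure; to survey single nucleotide polymorphisms (SNPs) in the CsCHS coding region; and, through association analysis with tea polyphenol content, to identify SNP sites in the gene that may be significantly or highly significantly correlated with polyphenol content. [Method] Specific primers were designed from the CsCHS sequences already available in the NCBI database, and PCR amplification was performed with genomic DNA and cDNA as templates. After cloning and sequencing, full-length gDNA and cDNA sequences were obtained for three genes, CsCHS1, CsCHS2 and CsCHS3, and the CsCHS gene structure was determined by sequence alignment. The sequences were analyzed with bioinformatics tools such as Compute pI/Mw and SOPMA to predict and compare the protein structures of CsCHS1, CsCHS2 and CsCHS3. Fifty-seven tea cultivars with widely differing polyphenol contents were used as material: the cDNA of each accession served as template for PCR with specific primers, and the PCR products were sequenced directly to screen for SNPs in the CsCHS coding region. The coding-region SNPs were then combined with the polyphenol contents of the 57 accessions in an association analysis with TASSEL to screen for SNP sites that may be significantly or highly significantly correlated with tea polyphenol content. [Result] The cDNA sequences obtained for CsCHS1, CsCHS2 and CsCHS3 were 1277, 1320 and 1242 bp long, respectively, and each contained an open reading frame of 1170 bp; the corresponding gDNA sequences were 1600, 1330 and 1607 bp long. Alignment of the gDNA and cDNA sequences, combined with the GT-AG rule for eukaryotic introns, showed that CsCHS1 and CsCHS3 each contain two exons and one intron, with introns of 323 and 356 bp, respectively, whereas CsCHS2 may have no intron. The amino acid sequences deduced from the three cDNA sequences were highly similar (92.6%-95.4%), the characteristic conserved sites of the CHS protein subfamily were found in all three sequences, and bioinformatics analysis indicated that the protein structures of CsCHS1, CsCHS2 and CsCHS3 are highly similar. A total of 71 SNP sites were found in the CsCHS1 coding region, a frequency of 1 SNP per 16.48 bp, with no indels; the nucleotide diversity (π) of the gene was 0.0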
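The SNP statistics quoted for the CsCHS1 coding region (number of SNP sites, bp per SNP, nucleotide diversity π) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names and toy sequences are hypothetical, the input is assumed to be already aligned with no gaps, and heterozygous base calls from direct sequencing of PCR products are ignored. The last line assumes that the region surveyed is the 1170 bp open reading frame, which is consistent with the reported 1 SNP per 16.48 bp.

```python
from itertools import combinations


def segregating_sites(aligned):
    """Return 0-based column indices at which the aligned sequences differ."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")
    return [i for i in range(length) if len({seq[i] for seq in aligned}) > 1]


def nucleotide_diversity(aligned):
    """Mean number of pairwise differences per site (pi)."""
    length = len(aligned[0])
    pairs = list(combinations(aligned, 2))
    total_diffs = sum(
        sum(a != b for a, b in zip(x, y)) for x, y in pairs
    )
    return total_diffs / (len(pairs) * length)


def bp_per_snp(region_length, snp_count):
    """Average spacing between SNPs, in bp."""
    return region_length / snp_count


# Toy alignment (invented data, three sequences of 10 bp).
toy = ["ATGGCTACGT", "ATGGCTACGA", "ATGACTACGT"]
sites = segregating_sites(toy)      # columns 3 and 9 vary
pi = nucleotide_diversity(toy)      # 4 differences / (3 pairs * 10 sites)

# Assumption: 71 SNPs over the 1170 bp ORF.
spacing = round(bp_per_snp(1170, 71), 2)  # 16.48
```

Real data would also need handling of IUPAC ambiguity codes at heterozygous positions, which this sketch leaves out.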
[Objective] The objectives of this study were to determine the structures of the CHS genes in Camellia sinensis (CsCHS) by obtaining their gDNA sequences, and to analyze single nucleotide polymorphisms (SNPs) in the CsCHS coding region. An association analysis was also carried out to find SNP sites in CsCHS that may be significantly correlated with polyphenol content in the tea plant. [Method] Specific primers were designed with Primer 3.0 software from CsCHS sequences deposited in NCBI. Genomic DNA and cDNA from tea leaves were used as templates in polymerase chain reaction (PCR), and the products were cloned and sequenced to obtain the gDNA and cDNA sequences of CsCHS1, CsCHS2 and CsCHS3. Gene structures were determined by aligning the gDNA and cDNA sequences. The deduced amino acid sequences were analyzed with bioinformatics tools such as Compute pI/Mw and SOPMA. SNPs in CsCHS were surveyed in 57 cultivars with widely varying polyphenol contents: the coding region was amplified by PCR with specific primers, using the cDNA of each cultivar as template, and the PCR products were sequenced directly. TASSEL software was used for the association analysis. [Result] The cDNA sequences of CsCHS1, CsCHS2 and CsCHS3 were 1277 bp, 1320 bp and 1242 bp long, respectively, and each contained an open reading frame (ORF) of 1170 bp. The gDNA sequences of CsCHS1, CsCHS2 and CsCHS3 were 1600 bp, 1330 bp and 1607 bp long, respectively. Comparison of the gDNA and cDNA sequences of each gene, combined with the GT-AG rule, showed that CsCHS1 and CsCHS3 each have two exons and one intron. The intron is 323 bp in CsCHS1 and 356 bp in CsCHS3. No interrupting region was found in CsCHS2, suggesting that CsCHS2 may have no intron. The deduced amino acid sequences share 92.6%-95.4% identity, and all of the conserved amino acids characteristic of the CHS protein subfamily were found in them.
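The gene-structure step (comparing gDNA with cDNA and applying the GT-AG rule) can be sketched as follows. This is an illustrative simplification, not the procedure used in the study: `find_single_intron` is a hypothetical helper, the sequences are invented, and it assumes that both sequences cover exactly the same amplicon, contain at most one intron, and are free of sequencing differences. When several split points are compatible with the exon sequence, the GT-AG check is what picks the boundary.

```python
def find_single_intron(gdna, cdna):
    """Locate one intron in gdna by comparison with cdna.

    Returns (start, end) as 0-based, end-exclusive coordinates in gdna,
    or None if no single GT...AG insertion explains the difference.
    """
    intron_len = len(gdna) - len(cdna)
    if intron_len <= 0:
        return None  # gDNA is not longer than cDNA: no intron to place
    for start in range(len(cdna) + 1):
        end = start + intron_len
        if gdna[:start] != cdna[:start]:
            break  # exon 1 no longer matches; later splits cannot work
        if (
            gdna[end:] == cdna[start:]          # exon 2 matches
            and gdna[start:start + 2] == "GT"   # donor site
            and gdna[end - 2:end] == "AG"       # acceptor site
        ):
            return start, end
    return None


# Toy example (invented sequences).
exon1 = "ATGGCTCCAG"
intron = "GTAAGTTTTCTAACAG"
exon2 = "GTTGCATAA"
gdna = exon1 + intron + exon2
cdna = exon1 + exon2

location = find_single_intron(gdna, cdna)   # (10, 26)
no_intron = find_single_intron(cdna, cdna)  # None
```

In practice the gDNA and cDNA amplicons may start and end at different positions, so a real analysis would rely on a spliced alignment rather than on exact prefix and suffix matching.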
Furthermore, bioinformatics analysis showed highly si