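【Objective】The quality of flower color in herbaceous peony (Paeonia lactiflora) affects its ornamental and commercial value. This study examines the codon usage bias of peony flower-color regulatory genes and the factors that shape their codon usage patterns, to provide a reference for work on these genes in mRNA translation, transgene design, prediction of the expression and function of new genes, and molecular evolution.【Method】From 6,345 flower-color regulatory genes previously screened by transcriptome sequencing of the flower-color chimera cultivar 'Jinhui', the 2,234 gene sequences retained after filtering by CDS sequence characteristics and a length greater than 300 bp were used as the study material. Mobyle software was used to calculate codon bias indices, including GC content, the average GC content at the first and second codon positions (GC12), GC content at the third codon position (GC3s), effective number of codons (ENC), codon adaptation index (CAI), and relative synonymous codon usage (RSCU). Neutrality plot (GC12 vs. GC3), ENC-GC3s plot, and PR2 (Parity Rule 2) plot analyses were then performed, and multivariate statistical analysis was used to assess how strongly mutational pressure and selection influence codon usage patterns. Finally, 5% of the CAI values were used to define high- and low-expression sample groups, RSCU was calculated for both groups, and a chi-square test of the differences between the two groups was used to determine the optimal codons.【Result】The GC3s of the flower-color-related genes was 46.37%, and the GC content of most genes fell between 30% and 55%. The neutrality plot showed a highly significant positive correlation between GC3s and GC12 (R²=0.202, P<0.01). The ENC-GC3s plot showed that most genes lay around the standard curve, while some lay far below it; for most genes the ratio (ENCexp-ENCobs)/ENCexp was concentrated between 0.0 and 0.4. The PR2 plot showed that at the third codon position T was used more frequently than A, and C more frequently than G, indicating that purines (A and G) and pyrimidines (T and C) are not used in balance. Correspondence analysis (COA) showed that the first axis accounted for 38.09% of the variation and the other three axes for 18.42%, 15.09%, and 14.59%, respectively, indicating that Axis 1 dominates the assessment of codon usage patterns in peony flower-color regulatory genes. Analysis of mutational pressure and selection found that the correlation coefficients between the first principal axis and GC3s and CAI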
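As an illustration of two of the indices named in the Method, the sketch below computes RSCU and GC3s for a single coding sequence. It is not the Mobyle implementation used in the study: the function names (`count_codons`, `rscu`, `gc3s`) and the example sequence are hypothetical, and the code assumes the standard genetic code and the common convention that GC3s excludes Met, Trp, and stop codons.

```python
from collections import Counter

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def count_codons(cds):
    """Count in-frame codons of a CDS; incomplete or ambiguous triplets are skipped."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    triplets = (cds[i:i + 3] for i in range(0, usable, 3))
    return Counter(t for t in triplets if t in CODON_TABLE)


def rscu(counts):
    """RSCU = observed codon count / mean count over its synonymous family."""
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    values = {}
    for aa, family in families.items():
        if aa == "*" or len(family) == 1:
            continue  # stop codons, Met and Trp have no synonymous choice
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        for codon in family:
            values[codon] = counts[codon] * len(family) / total
    return values


def gc3s(counts):
    """Fraction of synonymous codons (no Met, Trp, stop) ending in G or C."""
    synonymous = [c for c, aa in CODON_TABLE.items() if aa not in "*MW"]
    total = sum(counts[c] for c in synonymous)
    gc_ending = sum(counts[c] for c in synonymous if c[2] in "GC")
    return gc_ending / total if total else float("nan")


# Hypothetical toy CDS: Met, four Ala codons, stop.
example_counts = count_codons("ATGGCTGCCGCCGCGTAA")
example_rscu = rscu(example_counts)
example_gc3s = gc3s(example_counts)
```

An RSCU above 1 means a codon is used more often than expected under equal use of its synonyms. Comparing RSCU between high- and low-CAI gene sets, as described in the Method, builds on per-gene counts of this kind.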
【Objective】The flower color quality of Paeonia lactiflora affects its ornamental and commercial value. This study aims to characterize the codon usage pattern of genes regulating flower color and to identify the main factors shaping codon bias, which is relevant to mRNA translation, transgene design, prediction of the expression level and function of new genes, and studies of molecular biology and evolution.【Method】In a previous study, 6,345 differentially expressed genes were screened by transcriptome sequencing of the flower-color chimera cultivar 'Jinhui', which bears red outer petals and yellow inner petals on a consistent genetic background. These genes were then filtered by CDS sequence characteristics and a length greater than 300 bp, leaving 2,234 genes as the study material. Mobyle software was used to calculate codon usage parameters, including GC content, average GC content of the first and second codon positions (GC12), GC content of the third codon position (GC3s), effective number of codons (ENC), codon adaptation index (CAI), and relative synonymous codon usage (RSCU). A neutrality plot (GC12 vs. GC3), an ENC-GC3s plot, and a Parity Rule 2 (PR2) plot were then analyzed. In addition, the influence of mutational pressure and translational selection was examined by multivariate statistical analysis. Finally, 5% of the CAI values were used to define high-expression and low-expression sample groups, RSCU values were calculated for both groups, and a chi-square test of the differences between them was used to determine the optimal codons.【Result】The GC content at the third codon position (GC3s) was 46.37%, and the GC content of most genes fell between 30% and 55%. Neutrality analysis showed a significant positive correlation (R²=0.202, P<0.01) between GC3s and GC12. The ENC plot showed most of the genes on or close to the expected curve, but