Faithful DNA replication is critical for genome integrity. The fission yeast Schizosaccharomyces pombe (S. pombe) is an important model organism for studying DNA replication and other processes of gene expression and regulation in eukaryotes. The authors statistically analyzed the base composition, k-mer (k = 2 to 4) frequencies, GC/AT skew, and GC/AT profile of the regions around S. pombe replication origins. The AT content of replication origin regions was significantly higher than that of non-origin regions. Origin regions were enriched in WW (W = A or T) dinucleotides, whereas non-origin regions were enriched in SS (S = G or C) dinucleotides. The GC/AT skew and GC/AT profile also differed significantly between origin and non-origin regions. Furthermore, using only DNA sequence or structural features as input to a support vector machine, the authors distinguished replication origin sequences from non-origin sequences in S. pombe, with an overall prediction accuracy of about 70%. This result indicates that DNA sequence plays an important role in specifying replication origins in S. pombe. The work is of theoretical value for further elucidating the mechanism of DNA replication in eukaryotes.
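The sequence statistics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The function names are hypothetical, and because the abstract does not give the exact definitions of "GC/AT skew" or "GC/AT profile", the sketch assumes the commonly used skew formulas (G - C)/(G + C) and (A - T)/(A + T) and does not attempt the profile.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def at_content(seq: str) -> float:
    """Fraction of A and T among the unambiguous bases of seq."""
    seq = seq.upper()
    total = sum(seq.count(b) for b in BASES)
    if total == 0:
        return 0.0
    return (seq.count("A") + seq.count("T")) / total


def kmer_frequencies(seq: str, k: int) -> dict:
    """Relative frequency of every k-mer over ACGT (overlapping windows).

    Windows containing ambiguous symbols such as N are skipped.
    """
    seq = seq.upper()
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set(BASES)
    )
    total = sum(counts.values())
    kmers = ("".join(p) for p in product(BASES, repeat=k))
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in kmers}


def skew(seq: str, x: str, y: str) -> float:
    """Assumed skew definition: (n_x - n_y) / (n_x + n_y)."""
    seq = seq.upper()
    nx, ny = seq.count(x), seq.count(y)
    return (nx - ny) / (nx + ny) if (nx + ny) else 0.0


def ww_ss_fractions(seq: str) -> tuple:
    """Summed frequencies of WW (W = A or T) and SS (S = G or C) dinucleotides."""
    freqs = kmer_frequencies(seq, 2)
    ww = sum(freqs[a + b] for a in "AT" for b in "AT")
    ss = sum(freqs[a + b] for a in "GC" for b in "GC")
    return ww, ss
```

Concatenating the outputs of `kmer_frequencies` for k = 2 to 4 with the two skew values would give one fixed-length feature vector per sequence. A vector of this kind could serve as input to a support vector machine, although the abstract does not specify the feature encoding the authors actually used.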