Using all ORF sequences in the genomes of Archaeoglobus fulgidus, Bacillus subtilis, Escherichia coli, Saccharomyces cerevisiae, Caenorhabditis elegans and Arabidopsis thaliana as samples, properties of synonymous codon repeat sequences in coding sequences, and of the amino acid repeat sequences they encode, are analyzed. The results show that codon usage within synonymous codon repeat sequences differs markedly from codon usage across all coding sequences, and that for some codons the usage bias grows as the repeat length increases. Within synonymous codon repeats of length 4 and 5, doublets of identical codons clearly prefer the two ends of the repeat. The repeat ability of each codon is also given, and the relationship between the magnitude of the repeat ability and the base composition of the codon is discussed. For amino acid repeat sequences, the logarithm of the number of amino acid repeat segments declines logarithmically with repeat length. The amino acid content of repeat sequences differs significantly from that of the full sequences, and for some amino acids the content bias grows as the repeat length increases. Hydrophilic, small and nonpolar amino acids readily form long repeats, whereas the maximal repeat length formed by hydrophobic, polar and bulky amino acids is relatively short. Finally, the six model organisms are compared, and the relationship between these results and biological evolution is discussed.
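To make the terms concrete, the following is a minimal sketch of how synonymous codon repeats and identical-codon doublets could be located in a single ORF. It is an illustration under stated assumptions, not the procedure used in the study: the function names (`split_codons`, `synonymous_runs`, `identical_doublet_positions`) are hypothetical, the standard genetic code is assumed for every organism, and a "synonymous codon repeat" is taken to mean a run of consecutive codons that all encode the same amino acid.

```python
from itertools import groupby

# Standard genetic code, built in the conventional TCAG order (assumption:
# the standard code is applied to every genome).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO)
)


def split_codons(orf):
    """Split an in-frame DNA sequence into codons, dropping any trailing partial codon."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3
    return [orf[i:i + 3] for i in range(0, usable, 3)]


def synonymous_runs(orf, min_len=2):
    """Return (amino_acid, start_codon_index, codons) for each run of
    consecutive codons encoding the same amino acid, with length >= min_len.
    Stop codons and unrecognized codons are skipped."""
    runs = []
    pos = 0
    for aa, group in groupby(split_codons(orf), key=CODON_TABLE.get):
        block = list(group)
        if aa is not None and aa != "*" and len(block) >= min_len:
            runs.append((aa, pos, block))
        pos += len(block)
    return runs


def identical_doublet_positions(codons):
    """Return offsets i within a run where codons[i] == codons[i + 1]."""
    return [i for i in range(len(codons) - 1) if codons[i] == codons[i + 1]]


if __name__ == "__main__":
    example_orf = "ATGCAGCAACAGCAGGCTGCCTAA"  # made-up sequence for illustration
    for aa, start, block in synonymous_runs(example_orf):
        print(aa, start, block, identical_doublet_positions(block))
```

In a run of length 4 the possible doublet offsets are 0, 1 and 2, so a preference for the two ends would show up as an excess of offsets 0 and 2 when such positions are tallied over all ORFs in a genome.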