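Phylogenetic trees were constructed from differences in oligonucleotide frequency composition among complete genome sequences and compared with trees built by traditional methods, in order to assess the feasibility and scope of applicability of tree construction based on such differences. Starting from oligonucleotide frequency data obtained from the complete genome sequences of 194 prokaryotes, a weight matrix was built according to the conservation of the oligonucleotide composition and used to weight the composition differences. Three measures (the absolute value distance, the Lance distance, and the Euclidean distance) were tried for computing the distance matrix, and phylogenetic trees were then built with a distance method. The results show that, for taxonomic units below the family level, the trees largely agree with those built by traditional methods, whereas taxonomic units above the family level are difficult to resolve. This is the first use of oligonucleotide frequency conservation as a weighting scheme for building trees from complete genome sequences with an alignment-free algorithm.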
In this paper, phylogenetic trees were constructed from differences in oligonucleotide frequencies across genomes. Oligonucleotide frequencies were calculated for 194 prokaryotic genomes and analyzed for their conservation across genomes. Frequency differences among genomes were weighted according to this conservation, and the weighted differences were used as measures of genetic distance to construct phylogenetic trees. For taxonomic units below the family level, the resulting trees were similar to those obtained with the traditional method, whereas trees for units above the family level differed. The phylogenetic trees constructed in this study are true whole-genome trees. The approach is alignment-free and is distinctive in weighting the oligonucleotide frequency differences.
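The pipeline summarized above (oligonucleotide frequencies, conservation-based weights, weighted distance matrix) can be illustrated with a minimal Python sketch. It is not the procedure used in the study: the abstract does not give the weighting formula, so the weight used here, 1 / (1 + coefficient of variation) of each oligonucleotide's frequency across genomes, is an assumption chosen only to give more conserved oligonucleotides a larger weight. The Lance distance is taken in its Lance-Williams (Canberra-type) form, which is likewise an assumption, and all function names and the toy sequences are hypothetical.

```python
from itertools import product
from math import sqrt


def kmer_frequencies(seq, k):
    """Relative frequency of each k-mer over ACGT, in lexicographic order."""
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(words, 0)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows containing N or other symbols
            counts[word] += 1
    total = sum(counts.values())
    return [counts[w] / total if total else 0.0 for w in words]


def conservation_weights(profiles):
    """ASSUMED weighting, not the paper's: 1 / (1 + CV), normalised to sum to 1.

    A k-mer whose frequency varies little across genomes (small coefficient
    of variation) is treated as conserved and receives a larger weight.
    """
    n = len(profiles)
    raw = []
    for column in zip(*profiles):
        mean = sum(column) / n
        std = sqrt(sum((v - mean) ** 2 for v in column) / n)
        cv = std / mean if mean > 0 else 0.0
        raw.append(1.0 / (1.0 + cv))
    total = sum(raw)
    return [r / total for r in raw]


def weighted_distance(x, y, w, metric):
    """Weighted distance between two frequency profiles."""
    if metric == "absolute":
        return sum(wi * abs(a - b) for a, b, wi in zip(x, y, w))
    if metric == "lance":
        # Lance-Williams (Canberra-type) form; terms with a + b == 0 are skipped.
        return sum(wi * abs(a - b) / (a + b)
                   for a, b, wi in zip(x, y, w) if a + b > 0)
    if metric == "euclidean":
        return sqrt(sum(wi * (a - b) ** 2 for a, b, wi in zip(x, y, w)))
    raise ValueError(f"unknown metric: {metric}")


def distance_matrix(profiles, w, metric):
    """Symmetric matrix of pairwise weighted distances with a zero diagonal."""
    n = len(profiles)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = weighted_distance(profiles[i], profiles[j], w, metric)
    return d


if __name__ == "__main__":
    # Toy sequences standing in for complete genomes (hypothetical data).
    genomes = {
        "g1": "ACGTACGTTGCAACGT",
        "g2": "ACGTACGATGCAACGA",
        "g3": "GGGCCCGGGCCCGGCC",
    }
    profiles = [kmer_frequencies(s, 2) for s in genomes.values()]
    weights = conservation_weights(profiles)
    for metric in ("absolute", "lance", "euclidean"):
        print(metric, distance_matrix(profiles, weights, metric))
```

The resulting matrix could then be passed to any distance-based tree-building method, such as UPGMA or neighbor joining. The abstract says only that a distance method was used, so the choice of method is left open here.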