Reconstructing the tree of life has long been a dream of evolutionary biologists. The sequencing of whole genomes from a large number of species makes it possible to build phylogenetic trees at the whole-genome level and to study the evolutionary relationships among species. In this paper, two statistical methods and three distance calculation methods are used to build phylogenetic trees based on protein structure at the whole-genome level. The whole genomes of 93 species, covering three superkingdoms (Eukarya, Bacteria and Archaea), are selected for analysis. The resulting trees correctly divide these species into three major groups, and the clustering within each major branch largely agrees with the morphological classification of the species. Comparing the clustering results of these methods with the taxonomic classification shows that the combination of the abundance statistic and the distance based on the angle between two vectors builds better phylogenetic trees than the other combinations.
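To make the vector-angle distance concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each genome is summarized as an abundance vector whose entries count how often each protein structural category occurs in that genome; the abstract does not define the vector entries, so this reading is an assumption. The function names `angle_distance` and `distance_matrix`, the genome labels, and the counts are all hypothetical and serve only as an illustration.

```python
import math


def angle_distance(u, v):
    """Return the angle in radians between two abundance vectors.

    Assumption: u and v are equal-length sequences of non-negative counts.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("the angle is undefined for a zero vector")
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cosine)


def distance_matrix(profiles):
    """Build a pairwise angle-distance matrix from {name: vector}."""
    names = sorted(profiles)
    matrix = [
        [angle_distance(profiles[a], profiles[b]) for b in names]
        for a in names
    ]
    return names, matrix


# Hypothetical toy counts, for illustration only (not data from the paper).
profiles = {
    "genome_A": [12, 0, 3, 7],
    "genome_B": [10, 1, 4, 6],
    "genome_C": [0, 9, 8, 1],
}
names, matrix = distance_matrix(profiles)
```

A matrix of this kind could then be passed to any distance-based tree-building procedure. Because the angle depends only on the direction of the vectors, two genomes with proportional abundance profiles have a distance of zero regardless of genome size.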