The p53 family genes are closely associated with tumors. To study the evolutionary differences among them, we first applied the chaos game representation (CGR) to map the CDS sequences of the p53 family genes to point series in the rectangular plane coordinate system. Next, weighted feature vectors were constructed from the coordinates of these points and used as numerical characterizations of the CDS sequences. We then performed cluster analysis on the CDS sequences based on Euclidean distance and defined two measures, within-group accuracy and grouping goodness, to evaluate the clustering results. Finally, gene segmentation and analysis of variance were combined for further study, which yielded a better classification. The results indicate that the evolutionary differences within the p53 family are mainly reflected in the first two-thirds of the CDS sequences and in the second and seventh dimensions of the weighted feature vector, a finding that is important for the study of p53 family gene sequences.
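As an illustration of the first and third steps summarized above, the following is a minimal Python sketch of the standard chaos game representation and of a Euclidean distance between numeric vectors. It rests on several assumptions that are not taken from the abstract: the assignment of nucleotides to the corners of the unit square, the starting point (0.5, 0.5), and the function names `cgr_points`, `euclidean`, and `mean_point` are all illustrative. The weighted feature vector used in this work is not defined in the abstract, so `mean_point` is only a hypothetical stand-in feature.

```python
import math

# Assumed corner assignment (a common CGR convention, not taken from the paper).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq, start=(0.5, 0.5)):
    """Map a nucleotide sequence to CGR points.

    Each new point is the midpoint between the previous point and the
    corner assigned to the current nucleotide.
    """
    x, y = start
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip symbols other than A, C, G, T (e.g. N)
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def euclidean(u, v):
    """Euclidean distance between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def mean_point(points):
    """Hypothetical stand-in feature: the centroid of the CGR points.

    This is NOT the paper's weighted feature vector, whose definition
    is not given in the abstract.
    """
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)
```

Under these assumptions, two CDS sequences could be compared with `euclidean(mean_point(cgr_points(s1)), mean_point(cgr_points(s2)))`. In the actual analysis, the stand-in feature would be replaced by the weighted feature vector, and the resulting pairwise distances would feed the cluster analysis.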