Fuzzy clustering plays an important role in revealing the relationships and dependencies among data. Based on the properties of fuzzy equivalence relations, this paper uses the weighted Hamming distance to establish a fuzzy proximity relation between sequences, and then constructs a fuzzy equivalence matrix for cluster analysis. Eighteen mRNA sequences of the human tumor protein p53 and its family members p63 and p73 were selected and clustered, using the base composition of each sequence as the indicator. At λ = 0.95159 the sequences fall into three classes, with p53, p63 and p73 each forming its own class. From a clustering perspective, p53, p63 and p73 therefore differ in structure and function. The method has some biological relevance for predicting the structure and function of unknown genes.
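The pipeline described above (weighted Hamming distance, fuzzy proximity relation, fuzzy equivalence matrix, λ-cut) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation. It assumes that similarity is taken as 1 minus the weighted Hamming distance, that the weights are equal and sum to 1, and that the equivalence matrix is obtained as the max-min transitive closure. The function names and the five base-composition vectors are hypothetical toy values, not the paper's data.

```python
def weighted_hamming_similarity(x, y, weights):
    """Similarity in [0, 1]: 1 minus the weighted Hamming distance.

    Assumes features lie in [0, 1] and the weights sum to 1.
    """
    return 1.0 - sum(w * abs(a - b) for w, a, b in zip(weights, x, y))


def proximity_matrix(samples, weights):
    """Reflexive, symmetric fuzzy proximity relation between all samples."""
    n = len(samples)
    return [[weighted_hamming_similarity(samples[i], samples[j], weights)
             for j in range(n)] for i in range(n)]


def max_min_compose(r, s):
    """Max-min composition of two square fuzzy relations."""
    n = len(r)
    return [[max(min(r[i][k], s[k][j]) for k in range(n))
             for j in range(n)] for i in range(n)]


def transitive_closure(r):
    """Square the relation repeatedly until it stops changing.

    The fixed point is a fuzzy equivalence matrix. Max and min only select
    existing entries, so the exact equality check terminates.
    """
    while True:
        squared = max_min_compose(r, r)
        if squared == r:
            return r
        r = squared


def lambda_cut_clusters(equiv, lam):
    """Group indices whose equivalence degree is at least lam."""
    n = len(equiv)
    assigned = [False] * n
    clusters = []
    for i in range(n):
        if assigned[i]:
            continue
        members = [j for j in range(n) if equiv[i][j] >= lam]
        for j in members:
            assigned[j] = True
        clusters.append(members)
    return clusters


# Hypothetical base compositions (fractions of A, C, G, T) for five sequences.
samples = [
    (0.20, 0.30, 0.30, 0.20),
    (0.21, 0.29, 0.30, 0.20),
    (0.30, 0.20, 0.20, 0.30),
    (0.29, 0.21, 0.20, 0.30),
    (0.25, 0.25, 0.25, 0.25),
]
weights = (0.25, 0.25, 0.25, 0.25)  # assumed equal weights

closure = transitive_closure(proximity_matrix(samples, weights))
print(lambda_cut_clusters(closure, 0.99))
print(lambda_cut_clusters(closure, 0.90))
```

Raising λ splits the data into more, tighter classes, while lowering it merges them. In this toy example, λ = 0.99 separates the five sequences into three groups and λ = 0.90 merges them into one, which mirrors how a threshold such as λ = 0.95159 selects a particular partition.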