Predicting horizontally transferred genes is important both for understanding evolution and for estimating, qualitatively and quantitatively, the genetic material exchanged between species. This paper proposes a method for predicting horizontal gene transfer (HGT) in bacterial genomes based on biomimetic pattern recognition (BPR). BPR rests on the principle of homology-continuity (PHC): samples of the same class are continuously distributed in feature space, so the difference between any two of them can be bridged by gradual change. Rather than "classifying" and partitioning patterns as traditional pattern recognition does, BPR emphasizes "cognition" of each class, which is closer to the way humans recognize things. Its aim is to find an optimal covering of each class in feature space, stressing the similarity among homologous samples rather than the division between classes. BPR has already been applied successfully to multi-camera face identity verification, face recognition, image restoration, and speech recognition. We use a network of hyper sausage neurons (HSN), a kind of covering unit in BPR, to recognize horizontally transferred genes. The results show that the BPR approach outperforms both the gene scoring method based on 8-nucleotide frequencies (W8), reported to give the best predictions to date, and a recognition algorithm based on the support vector machine (SVM). In particular, for the Escherichia coli K12 genome the hit ratio improved by 42.3% over W8 and by 30.5% over SVM.
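To make the covering idea concrete, the following is a minimal sketch of how a hyper sausage neuron is commonly described: a unit that accepts every point lying within a fixed radius of the line segment joining two training samples of the same class. It is an illustration under stated assumptions, not the implementation used in this work. The names `distance_to_segment` and `SausageCover`, the chaining of consecutive samples into segments, the Euclidean metric, and the toy two-dimensional data are all hypothetical choices; the abstract does not specify the feature vectors, the sample ordering, or the radius.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def distance_to_segment(x: Vector, a: Vector, b: Vector) -> float:
    """Euclidean distance from point x to the line segment joining a and b."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ax = [xi - ai for ai, xi in zip(a, x)]
    ab_sq = sum(c * c for c in ab)
    if ab_sq == 0.0:
        # Degenerate segment (a == b): the unit reduces to a hypersphere.
        t = 0.0
    else:
        # Project x onto the line through a and b, then clamp to the segment.
        t = max(0.0, min(1.0, sum(p * q for p, q in zip(ax, ab)) / ab_sq))
    closest = [ai + t * c for ai, c in zip(a, ab)]
    return math.dist(x, closest)


class SausageCover:
    """Union of sausage-shaped units chained over same-class samples.

    Assumption: consecutive samples are linked in the order given. How
    samples are actually ordered or selected is not stated in the abstract.
    """

    def __init__(self, samples: List[Vector], radius: float) -> None:
        if not samples:
            raise ValueError("at least one training sample is required")
        self.radius = radius
        if len(samples) == 1:
            self.segments: List[Tuple[Vector, Vector]] = [(samples[0], samples[0])]
        else:
            self.segments = list(zip(samples, samples[1:]))

    def distance(self, x: Vector) -> float:
        """Distance from x to the nearest segment of the covering."""
        return min(distance_to_segment(x, a, b) for a, b in self.segments)

    def recognizes(self, x: Vector) -> bool:
        """True if x falls inside the covering, i.e. is 'recognized'."""
        return self.distance(x) <= self.radius


if __name__ == "__main__":
    # Toy 2-D feature vectors standing in for real gene features.
    cover = SausageCover([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)], radius=0.25)
    print(cover.recognizes((0.5, 0.1)))  # inside the first sausage: True
    print(cover.recognizes((0.0, 1.0)))  # far from both segments: False
```

Because a point is accepted only when it lies inside the covering of a class, a sample that falls outside every covering is simply left unrecognized instead of being forced into the nearest class, which is the contrast with partition-based classifiers drawn above.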