DNA sequences from the genomes of three model organisms, C. elegans, S. cerevisiae and A. thaliana, were divided into three classes: introns, exons and intergenic sequences. For each sequence, the occurrence counts of the 64 trinucleotides, obtained with a sliding window, were taken as the state parameters of a diversity source, so that every sequence is represented by a 64-dimensional feature vector. These vectors were split into a training sample set and a testing sample set. Following immune network evolution theory, the increment of diversity was used as the antibody-antigen affinity function, and the training samples were treated as antigens that repeatedly stimulate the immune network so that it evolves toward recognizing them. The result is an immune classifier based on the increment of diversity. In testing, the classifier performed well, with a classification prediction accuracy above 85%.
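
To make the feature extraction and the affinity function concrete, the following is a minimal Python sketch, not the implementation used in this work. It assumes the commonly used definition of the diversity measure, D(X) = N log N - sum_i n_i log n_i with N = sum_i n_i, and of the increment of diversity, ID(X, Y) = D(X + Y) - D(X) - D(Y); the abstract does not state these formulas. The function names and the `class_profiles` dictionary are hypothetical, and the final `classify` step is a plain nearest-profile rule that stands in for the evolved immune network, which is not reproduced here.

```python
import math
from itertools import product

# The 64 trinucleotides over the DNA alphabet, in a fixed order.
TRINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=3)]
INDEX = {t: i for i, t in enumerate(TRINUCLEOTIDES)}


def trinucleotide_counts(seq):
    """Count overlapping trinucleotides with a sliding window of step 1."""
    counts = [0] * 64
    s = seq.upper()
    for i in range(len(s) - 2):
        j = INDEX.get(s[i:i + 3])  # windows with non-ACGT symbols are skipped
        if j is not None:
            counts[j] += 1
    return counts


def diversity(counts):
    """Assumed diversity measure: D = N log N - sum(n_i log n_i)."""
    n = sum(counts)
    if n == 0:
        return 0.0
    return n * math.log(n) - sum(c * math.log(c) for c in counts if c > 0)


def increment_of_diversity(x, y):
    """Assumed ID(X, Y) = D(X + Y) - D(X) - D(Y); smaller means more similar."""
    merged = [a + b for a, b in zip(x, y)]
    return diversity(merged) - diversity(x) - diversity(y)


def classify(seq, class_profiles):
    """Hypothetical stand-in: pick the class whose profile has the smallest ID."""
    counts = trinucleotide_counts(seq)
    return min(
        class_profiles,
        key=lambda label: increment_of_diversity(counts, class_profiles[label]),
    )
```

Under these assumptions, two sequences with proportional trinucleotide counts have an increment of diversity of zero, and the value grows as their compositions diverge, which is why a small increment can serve as a high affinity between an antibody and an antigen.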