Exploiting the fact that DNA sequences with different structures yield L values in different ranges during hidden Markov model (HMM) training, this paper improves the traditional multiclass voting model and proposes a fast training algorithm that outperforms the traditional one. The proposed algorithm needs to train the HMM parameters for only one class. Applied to the recognition of intron and exon sequences in DNA, it achieves an average recognition rate of 90.8%. Compared with the support vector machine, the HMM is better suited to multiclass classification problems: it requires less computation time and achieves a higher recognition rate.
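The idea of separating classes by the range of the L value under a single trained HMM can be illustrated with a minimal sketch. It assumes that the L value is the log-likelihood of a sequence under the model, normalized here by sequence length; the abstract does not define L, so this reading is an assumption. The model parameters, the threshold, and the names `log_likelihood`, `classify`, `MODEL`, and `THRESHOLD` are hypothetical toy values chosen for illustration, not the authors' trained model or decision rule.

```python
import math

SYMBOLS = "ACGT"

def log_likelihood(seq, start, trans, emit):
    """Scaled forward algorithm: returns log P(seq | model)."""
    obs = [SYMBOLS.index(c) for c in seq]
    n = len(start)
    # Initialization: start probability times emission of the first symbol.
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    scale = sum(alpha)
    alpha = [a / scale for a in alpha]
    log_l = math.log(scale)
    # Induction: propagate through transitions, rescale at every step
    # to avoid numerical underflow on long sequences.
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][o]
            for j in range(n)
        ]
        scale = sum(alpha)
        alpha = [a / scale for a in alpha]
        log_l += math.log(scale)
    return log_l

def classify(seq, model, threshold):
    """Assign a class from the per-symbol L value under ONE model.

    Assumption: sequences of the modeled class score above the
    threshold, the other class below it.
    """
    score = log_likelihood(seq, *model) / len(seq)
    return "exon" if score >= threshold else "intron"

# Hypothetical two-state model standing in for the single trained HMM.
MODEL = (
    [0.5, 0.5],                      # start probabilities
    [[0.9, 0.1], [0.1, 0.9]],        # transition matrix
    [[0.1, 0.4, 0.4, 0.1],           # emissions of state 0 over A, C, G, T
     [0.2, 0.3, 0.3, 0.2]],          # emissions of state 1 over A, C, G, T
)
THRESHOLD = -1.4  # hypothetical boundary between the two L-value ranges

if __name__ == "__main__":
    for s in ("GCGCGCGCGC", "ATATATATAT"):
        print(s, classify(s, MODEL, THRESHOLD))
```

Only one model is scored per sequence in this sketch, whereas a conventional voting scheme would train and score one model per class. In practice the boundary between ranges would have to be estimated from the L values observed on labeled training sequences.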