An interpolated Markov model (IMM), which combines Markov models of several different orders, can capture how the context dependencies between neighboring nucleotides vary with the local composition of a DNA sequence. In this work, the IMM was incorporated into the Gibbs sampling algorithm to identify regulatory elements in the upstream sequences of genes. The improved algorithm was tested on simulated sequences and on 10 sets of gene sequences from the yeast Saccharomyces cerevisiae taken from the literature. The results indicate that it outperforms the conventional Gibbs sampling algorithm, which is based on a single-nucleotide independence model, both in identifying weakly conserved regulatory elements and in tolerating noisy sequences.
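The idea of combining Markov models of different orders can be illustrated with a minimal sketch. The abstract does not state how the orders are weighted, so the count-based weight `lam`, the `pseudo` and `scale` parameters, and the function names `count_contexts` and `imm_prob` are all assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict

ALPHABET = "ACGT"


def count_contexts(seqs, max_order):
    """Return counts[k][context][base] for every order k in 0..max_order."""
    counts = [defaultdict(lambda: defaultdict(int)) for _ in range(max_order + 1)]
    for seq in seqs:
        for i, base in enumerate(seq):
            for k in range(max_order + 1):
                if i >= k:  # need k preceding bases to form an order-k context
                    counts[k][seq[i - k:i]][base] += 1
    return counts


def imm_prob(counts, context, base, pseudo=1.0, scale=10.0):
    """Interpolated estimate of P(base | context).

    Starts from a uniform distribution and blends in each higher order.
    The weight lam = n / (n + scale) is an assumed heuristic: a context
    seen n times is trusted more as n grows, so sparse high-order
    contexts fall back on lower-order estimates.
    """
    prob = 1.0 / len(ALPHABET)
    max_order = len(counts) - 1
    for k in range(min(max_order, len(context)) + 1):
        ctx = context[len(context) - k:]  # last k bases of the context
        table = counts[k].get(ctx, {})
        total = sum(table.values())
        # Order-k estimate, smoothed with pseudocounts.
        p_k = (table.get(base, 0) + pseudo) / (total + pseudo * len(ALPHABET))
        lam = total / (total + scale)
        prob = lam * p_k + (1.0 - lam) * prob
    return prob


counts = count_contexts(["ACGTACGTACGT"], max_order=2)
probs = {b: imm_prob(counts, "AC", b) for b in ALPHABET}
```

Because every step is a convex combination of distributions over the four bases, the values in `probs` sum to 1. In a Gibbs sampler, an estimate of this kind could stand in for the single-nucleotide probabilities when scoring candidate sites; that wiring is not shown here.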