在序列拼接中,为了解决重复序列这个难题,本文提出了利用KMP匹配算法来识别并屏蔽重复序列的方法.该方法利用模式序列中的失效函数计算得到失效链接值,也就是当前一位置匹配失败后,下一次匹配开始的位置.利用这一函数避免了可预见的无用搜索,将穷举搜索算法所需的计算量大大减少.通过计算机模拟,验证了对重复序列的屏蔽,该算法将穷举算法所需时间复杂度由原来的减少到了.
An approach based on KMP algorithm is proposed with the aim of dealing with the problem of the repeats during the sequence assembly. According to the failure function traits of KMP algorithm, this method can drastically decrease the CPU computation time to search the probable position of query sequences. Computer simulations have proved that the performance of this method is improved from to.