To address the problems of parameter optimization and the insufficient use of additional information in current copy number variation (CNV) detection, this paper proposes a CNV detection algorithm based on a hidden Markov model (HMM). First, the reads are aligned to the reference sequence and the reads that fail to align are stored; the aligned reads are then counted per window, and the counts are smoothed and corrected. Next, an HMM is applied to the read counts to detect abnormal signals and produce candidate CNV calls. Finally, split-read alignment of the stored unaligned reads is used to filter the candidates, which improves detection performance. CNV detection results on both simulated and experimental data show that the proposed algorithm achieves high detection accuracy and coverage and outperforms commonly used detection algorithms.
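The three stages described above (windowed read counting, HMM-based detection, split-read filtering) can be illustrated with a minimal sketch. This is not the paper's implementation: the three-state model, the Gaussian emission on median-normalized depth, the parameters `sigma` and `stay`, and every function name below are assumptions chosen for illustration, and the split-read step is reduced to checking whether a hypothetical list of breakpoint positions supports each candidate boundary.

```python
import math

# Assumed state set and expected depth ratios for a diploid genome
# (one copy lost, two copies, one copy gained). Not taken from the paper.
STATES = ("loss", "neutral", "gain")
MEAN_RATIO = {"loss": 0.5, "neutral": 1.0, "gain": 1.5}


def window_counts(read_starts, chrom_len, window):
    """Count reads whose start position falls in each fixed-size window."""
    counts = [0] * ((chrom_len + window - 1) // window)
    for pos in read_starts:
        if 0 <= pos < chrom_len:
            counts[pos // window] += 1
    return counts


def normalize(counts):
    """Divide by the median count so a copy-neutral window is close to 1.0."""
    s = sorted(counts)
    mid = len(s) // 2
    median = s[mid] if len(s) % 2 else (s[mid - 1] + s[mid]) / 2
    if median == 0:
        raise ValueError("median read count is zero; use a larger window")
    return [c / median for c in counts]


def log_gauss(x, mu, sigma):
    """Log density of a normal distribution, used as the emission model."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))


def viterbi(ratios, sigma=0.15, stay=0.98):
    """Most likely state path for a non-empty list of depth ratios."""
    switch = (1 - stay) / (len(STATES) - 1)
    log_t = {a: {b: math.log(stay if a == b else switch) for b in STATES}
             for a in STATES}
    score = {s: -math.log(len(STATES)) + log_gauss(ratios[0], MEAN_RATIO[s], sigma)
             for s in STATES}
    back = []
    for x in ratios[1:]:
        prev, score, ptr = score, {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: prev[p] + log_t[p][s])
            score[s] = prev[best] + log_t[best][s] + log_gauss(x, MEAN_RATIO[s], sigma)
            ptr[s] = best
        back.append(ptr)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def call_segments(path, window):
    """Merge runs of identical non-neutral states into (start, end, state)."""
    segments, start = [], 0
    for i in range(1, len(path) + 1):
        if i == len(path) or path[i] != path[start]:
            if path[start] != "neutral":
                segments.append((start * window, i * window, path[start]))
            start = i
    return segments


def filter_by_split_reads(segments, breakpoints, tol):
    """Keep candidates whose two boundaries each lie within tol of a breakpoint.

    `breakpoints` stands in for positions inferred from split-read alignment
    of the stored unaligned reads; how they are obtained is outside this sketch.
    """
    def supported(pos):
        return any(abs(pos - b) <= tol for b in breakpoints)
    return [seg for seg in segments if supported(seg[0]) and supported(seg[1])]
```

For example, passing depth ratios of `[1.0] * 10 + [0.5] * 5 + [1.0] * 10 + [1.5] * 5 + [1.0] * 10` through `viterbi` and `call_segments` with a 100 bp window yields one loss candidate at 1000-1500 and one gain candidate at 2500-3000; `filter_by_split_reads` then retains only the candidates whose boundaries are supported by the supplied breakpoints. A real pipeline would estimate the emission and transition parameters from data rather than fixing them as done here.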