BIGpre: A Quality Assessment Package for Next-Generation Sequencing Data
  • ISSN: 1672-0229
  • Journal: Genomics, Proteomics & Bioinformatics (《基因组蛋白质组与生物信息学报:英文版》)
  • Classification: TP311.13 [Automation and Computer Technology: Computer Software and Theory; Automation and Computer Technology: Computer Science and Technology] O144 [Science: Mathematics; Science: Fundamental Mathematics]
  • Author affiliations: [1] CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, China; [2] James D. Watson Institute of Genome Sciences, Zhejiang University, Hangzhou 31007, China
  • Funding: supported by the National Natural Science Foundation of China (Grant Nos. 31000561 and 30900825) and the Knowledge Innovation Program of the Chinese Academy of Sciences (Grant No. KSCX2-EW-R-01-04)
Abstract:

The emergence of next-generation sequencing (NGS) technologies has significantly improved sequencing throughput and reduced costs. However, the short read length, duplicate reads and massive volume of data make data processing much more difficult and complicated than with first-generation sequencing technology. Although some software packages have been developed to assess data quality, those packages either are not easily available to users or require bioinformatics skills and computer resources. Moreover, almost all of the currently available quality assessment software fails to take sequencing errors into account when assessing duplicates in NGS data. Here, we present a new user-friendly quality assessment software package called BIGpre, which works for both Illumina and 454 platforms. BIGpre contains all the functions of other quality assessment software, such as the correlation between forward and reverse reads, read GC-content distribution, and base Ns quality. More importantly, BIGpre incorporates associated programs to detect and remove duplicate reads after taking sequencing errors into account, and to trim low-quality reads from raw data as well. BIGpre is primarily written in Perl and integrates graphical capability from the statistics package R. This package produces both tabular and graphical summaries of data quality for sequencing datasets from Illumina and 454 platforms. Processing hundreds of millions of reads within minutes, this package provides immediate diagnostic information that lets users manipulate sequencing data for downstream analyses. BIGpre is freely available at http://bigpre.sourceforge.net/.
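
The three per-read operations named in the abstract (GC-content calculation, low-quality trimming, and duplicate removal that tolerates sequencing errors) can be illustrated with a minimal Python sketch. This is not BIGpre's code or algorithm: BIGpre itself is written in Perl, and the function names, the quality threshold of 20, the 3'-end trimming rule, and the greedy Hamming-distance comparison are all assumptions chosen for illustration.

```python
from typing import List


def gc_content(read: str) -> float:
    """Fraction of G/C among called bases; uncalled bases (N) are ignored."""
    called = [b for b in read.upper() if b in "ACGT"]
    if not called:
        return 0.0
    return sum(b in "GC" for b in called) / len(called)


def trim_low_quality(read: str, quals: List[int], min_q: int = 20) -> str:
    """Drop bases from the 3' end until one has quality >= min_q.

    The threshold and the 3'-only rule are illustrative assumptions.
    """
    end = len(read)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return read[:end]


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length reads."""
    return sum(x != y for x, y in zip(a, b))


def remove_duplicates(reads: List[str], max_mismatches: int = 1) -> List[str]:
    """Keep a read only if no already-kept read of the same length lies
    within max_mismatches of it.

    Allowing a few mismatches treats reads that differ only by likely
    sequencing errors as duplicates. This greedy all-against-kept scan is
    quadratic, so it is a toy illustration and not a scalable method.
    """
    kept: List[str] = []
    for r in reads:
        is_dup = any(
            len(r) == len(k) and hamming(r, k) <= max_mismatches for k in kept
        )
        if not is_dup:
            kept.append(r)
    return kept


if __name__ == "__main__":
    reads = ["ACGTACGT", "ACGTACGA", "TTTTGGGG", "ACGTACGT"]
    print(remove_duplicates(reads, max_mismatches=1))
    print(gc_content("ACGTNN"))
    print(trim_low_quality("ACGTAC", [30, 30, 30, 25, 10, 5]))
```

With `max_mismatches=0` the function removes only exact duplicates, so the second read above is retained. With `max_mismatches=1` it is treated as an error-containing copy of the first read and dropped.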

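Journal information
  • Genomics, Proteomics & Bioinformatics (《基因组蛋白质组与生物信息学报:英文版》)
  • Supervising body: Chinese Academy of Sciences
  • Sponsor: Beijing Institute of Genomics, Chinese Academy of Sciences
  • Editor-in-chief:
  • Address: Beijing Institute of Genomics, Chinese Academy of Sciences, No. 7 Beitucheng West Road, Chaoyang District, Beijing
  • Postal code: 100029
  • Email: editor@big.ac.cn
  • Phone: 010-82995372
  • International standard serial number: ISSN 1672-0229
  • Domestic unified serial number: CN 11-4926/Q
  • Postal distribution code: 82-557
  • Awards:
  • Indexed in: Abstracts Journal (Russia), Chemical Abstracts (web edition, USA), Index Copernicus (Poland), abstract and citation database (Netherlands), Excerpta Medica (Netherlands), biomedical retrieval system (USA), Cambridge Scientific Abstracts (USA), biosciences database (USA)
  • Citations: 52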