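SHEsisPCA: A GPU-based software to correct for population stratification that efficiently accelerates the process for handling genome-wide datasets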
  • ISSN: 1673-8527
  • Journal: Journal of Genetics and Genomics (《遗传学报:英文版》)
  • Date: 0
  • Classification: TP393.08 [Automation and Computer Technology: Computer Application Technology; Automation and Computer Technology: Computer Science and Technology] Q78 [Biology: Molecular Biology]
  • Author affiliations: [1] Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Shanghai Jiao Tong University, Shanghai 200230, China; [2] Institute of Social Cognitive and Behavioral Sciences, Shanghai Jiao Tong University, Shanghai 200240, China; [3] School of Bio-medical Engineering, Shanghai Jiao Tong University, Shanghai 200230, China; [4] Shanghai Changning Mental Health Center, Shanghai 200042, China
  • Funding: This work was supported by the National Key Basic Research Program of China (973 Program) (No. 2015CB559100), the National High Technology Research and Development Program of China (863 Program) (Nos. 2012AA02A515 and 2012AA021802), the Natural Science Foundation of China (Nos. 31325014, 81130022, 81272302 and 81421061), the National Program for Support of Top-Notch Young Professionals, the Program of Shanghai Subject Chief Scientist (No. 15XD1502200), and the "Shu Guang" project supported by the Shanghai Municipal Education Commission and the Shanghai Education Development Foundation (No. 12SG17).
Abstract:

Population stratification is a problem in genetic association studies because it tends to highlight loci that underlie the population structure rather than disease-related loci. Principal component analysis (PCA) has proven to be an effective way to correct for population stratification. However, the conventional PCA algorithm is time-consuming when dealing with large datasets. We developed a graphics processing unit (GPU)-based PCA software named SHEsisPCA (http://analysis.bio-x.cn/SHEsisMain.htm) that is highly parallel, with a maximum speedup of more than 100-fold over its CPU version. A clustering algorithm based on X-means was also implemented to detect population subgroups and to obtain matched cases and controls, in order to reduce genomic inflation and increase statistical power. A study of both simulated and real datasets showed that SHEsisPCA ran at an extremely high speed while its accuracy was hardly reduced. Therefore, SHEsisPCA can help correct for population stratification much more efficiently than conventional CPU-based algorithms.
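
The PCA step described in the abstract can be illustrated with a minimal CPU sketch in NumPy. This is not the SHEsisPCA implementation (which runs on the GPU and whose internals are not given here); it only shows the general idea of extracting principal components from a samples-by-SNPs genotype matrix. The function names `genotype_pca` and `simulate_two_populations`, the per-SNP standardization, and all simulation parameters are assumptions made for illustration.

```python
import numpy as np


def simulate_two_populations(n_per_pop=50, n_snps=500, drift_sd=0.15, seed=0):
    """Hypothetical toy data: two populations whose allele frequencies differ."""
    rng = np.random.default_rng(seed)
    p1 = rng.uniform(0.1, 0.9, n_snps)
    # Second population: frequencies perturbed away from the first (assumed model)
    p2 = np.clip(p1 + rng.normal(0.0, drift_sd, n_snps), 0.05, 0.95)
    g1 = rng.binomial(2, p1, size=(n_per_pop, n_snps))
    g2 = rng.binomial(2, p2, size=(n_per_pop, n_snps))
    labels = np.repeat([0, 1], n_per_pop)
    return np.vstack([g1, g2]), labels


def genotype_pca(genotypes, n_components=2):
    """Top principal components of a samples x SNPs matrix of 0/1/2 allele counts."""
    g = np.asarray(genotypes, dtype=float)
    # Estimate the allele frequency of each SNP and drop monomorphic SNPs
    p = g.mean(axis=0) / 2.0
    keep = (p > 0) & (p < 1)
    g, p = g[:, keep], p[keep]
    # Standardize each SNP: centre by 2p, scale by sqrt(2p(1-p))
    z = (g - 2.0 * p) / np.sqrt(2.0 * p * (1.0 - p))
    # Sample-by-sample relationship matrix and its eigen-decomposition
    grm = z @ z.T / z.shape[1]
    eigvals, eigvecs = np.linalg.eigh(grm)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, order], eigvals[order]


if __name__ == "__main__":
    genotypes, labels = simulate_two_populations()
    pcs, eigvals = genotype_pca(genotypes, n_components=2)
    # Mean PC1 score per simulated population; the two groups should sit apart
    print(pcs[labels == 0, 0].mean(), pcs[labels == 1, 0].mean())
```

In an association analysis, the leading components would typically be included as covariates or, as the abstract describes, used as input to a clustering step that matches cases and controls. The matrix product `z @ z.T` and the eigen-decomposition dominate the cost for large cohorts, which is the kind of work a GPU implementation can parallelize.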

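Journal information
  • Journal of Genetics and Genomics (《遗传学报:英文版》)
  • Peking University Core Journal (2004 edition)
  • Supervising body: Chinese Academy of Sciences
  • Sponsors: Institute of Genetics and Developmental Biology, Chinese Academy of Sciences; Genetics Society of China
  • Editor-in-chief: Xue Yongbiao (薛勇彪)
  • Address: Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Datun Road, Andingmenwai, Beijing
  • Postal code: 100101
  • Email: ycxb@genetics.ac.cn
  • Telephone: 010-64807669
  • International standard serial number: ISSN 1673-8527
  • Domestic serial number: CN 11-5450/R
  • Postal distribution code: 2-819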
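  • Awards: Second Prize, Chinese Academy of Sciences Outstanding Journal Award (1996); Third Prize, National Outstanding Journal Award (1997); Second Prize, Chinese Academy of Sciences Outstanding Journal Award (200)
  • Indexed in: Abstracts Journal (Russia), Chemical Abstracts, web edition (USA), CAB Abstracts (UK), Index Copernicus (Poland), Scopus (Netherlands), Excerpta Medica (Netherlands), MEDLINE (USA), Science Citation Index Expanded (USA), BIOSIS Previews (USA), Zoological Record (UK), Japan Science and Technology Agency database (Japan), Chinese Science and Technology Core Journals (China), Peking University Core Journals, 2004 edition (China), Peking University Core Journals, 2000 edition (China)
  • Citations: 17519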