Identifying gene-gene or gene-environment interactions in genome-wide association studies (GWAS), especially when rare variants are included, is a major challenge in statistical genetics. A gene-based two-stage approach is proposed for identifying gene-gene and gene-environment interactions in GWAS, and it can accommodate rare variants. In the first stage, each variant is tested for equality of the trait variance across its genotypes, which is equivalent to testing for interaction between this variant and other variants or environmental factors. The variants are then ranked by their p-values. In the second stage, interactions are tested with a linear model among the top-ranked variants. When rare variants are included, the rare variants within a gene are collapsed into a combined rare variant (CRV), and each CRV is treated as a single variant. The two-stage approach is applied to the Genetic Analysis Workshop 17 (GAW17) dataset to identify the interaction between the KDR gene and smoking status for the quantitative trait Q1. The results indicate that the two-stage approach may be more powerful than a one-stage approach that omits the first stage.
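The two stages and the CRV collapsing step can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not name the equal-variance test, so Levene's test (median-centred) is used here as a stand-in; the CRV is coded as a carrier indicator; and the function names, the `maf_threshold` cutoff, and the `n_top` parameter are all hypothetical. No multiple-testing correction is applied.

```python
import numpy as np
from scipy import stats


def collapse_rare(genotypes, maf_threshold=0.01):
    """Collapse the rare variants of one gene into a combined rare variant (CRV).

    genotypes: (n_samples, n_variants) array of minor-allele counts (0/1/2).
    Returns a 0/1 vector: 1 if the individual carries any rare minor allele.
    The carrier-indicator coding and the MAF cutoff are assumptions.
    """
    genotypes = np.asarray(genotypes)
    maf = genotypes.mean(axis=0) / 2.0
    rare = maf < maf_threshold
    return (genotypes[:, rare].sum(axis=1) > 0).astype(int)


def stage1_pvalue(trait, genotype):
    """Stage 1: test equal trait variance across genotype groups.

    Levene's test is an assumed choice; the source does not specify the test.
    """
    groups = [trait[genotype == g] for g in np.unique(genotype)]
    groups = [g for g in groups if len(g) >= 2]
    if len(groups) < 2:
        return 1.0  # variance cannot be compared with fewer than two groups
    return stats.levene(*groups, center="median").pvalue


def stage2_pvalue(trait, x1, x2):
    """Stage 2: t-test of the interaction coefficient in the linear model
    trait = b0 + b1*x1 + b2*x2 + b3*x1*x2 + error."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    X = np.column_stack([np.ones(len(trait)), x1, x2, x1 * x2])
    beta, *_ = np.linalg.lstsq(X, trait, rcond=None)
    resid = trait - X @ beta
    dof = len(trait) - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.pinv(X.T @ X)
    t_stat = beta[3] / np.sqrt(cov[3, 3])
    return 2.0 * stats.t.sf(abs(t_stat), dof)


def two_stage(trait, variants, env, n_top=5):
    """Rank variants (or CRVs) by stage-1 p-value, then test the
    variant-by-environment interaction only for the n_top best-ranked ones.

    Returns {variant column index: unadjusted stage-2 p-value}.
    """
    trait = np.asarray(trait, dtype=float)
    variants = np.asarray(variants)
    p1 = np.array([stage1_pvalue(trait, variants[:, j])
                   for j in range(variants.shape[1])])
    top = np.argsort(p1)[:n_top]
    return {int(j): float(stage2_pvalue(trait, variants[:, j], env))
            for j in top}
```

In this sketch a gene with rare variants would first be passed through `collapse_rare`, and the resulting CRV column would be placed alongside the common variants in the `variants` matrix before calling `two_stage`. A one-stage comparison would simply call `stage2_pvalue` on every column without the ranking step.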