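Allelic variation is ubiquitous across organisms and plays an important role in regulating gene expression. To explore how the number of varieties (A), the average polymorphism information content of alleles (B), and the total contribution rate of candidate genes (C) affect the results of candidate gene analysis in association studies, this study used the empirical Bayes (E-Bayes) method to examine the effects of these factors on the detection power for candidate genes, the accuracy and precision of the genetic effect estimates, and the frequency of false positives. The results showed the following: (1) As A, B, and C increased, the detection power and the accuracy and precision of the effect estimates improved markedly, and the frequency of false positives decreased. (2) B had a significant effect on detection power. When B was kept at a high level, the average detection power could reach 80% even when the number of varieties and the total contribution rate of the candidate genes were both low; when B was at a medium level, a larger number of varieties was needed for the average statistical power to exceed 80%; when B was small, the statistical power did not reach 50% at any of the three contribution-rate levels, even with 100 varieties. (3) B had a significant effect on the accuracy and precision of the candidate gene effect estimates, both of which increased as B increased. (4) B also had a significant effect on the frequency of false positives. In the real-data analysis, four genes were found to be significantly associated with rice pasting temperature. Therefore, in statistical genetic analysis of functional differences among alleles, allelic polymorphism is the main influencing factor, while a larger number of varieties and a higher contribution rate also have important effects on the statistical power for candidate genes and on the accuracy and precision of the effect estimates.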
Allelic variations are ubiquitous in organisms and play important roles in regulating gene expression. To study the influence of the number of varieties (A), the average polymorphism information content (B), and the total contribution of candidate genes (C) on the association analysis of candidate genes, the empirical Bayes (E-Bayes) method was applied to explore the effects of these three factors on the statistical power for detecting candidate genes, the accuracy and precision of the estimates of genetic effects, and the false discovery rate (FDR). The results were as follows: (1) As factors A, B, and C increased, the statistical power and the accuracy and precision of the estimates of genetic effects all improved, while the FDR decreased. (2) Factor B had a significant influence on the statistical power for detecting candidate genes. When factor B was at a high level, the average statistical power could still reach 80% even though both factors A and C remained at low levels. When factor B was at a medium level, more varieties were needed for the statistical power to reach 80%. However, when factor B was at a low level, the statistical power did not reach 50% at any of the three levels of factor C, even when factor A was equal to 100. (3) Factor B had a significant impact on the accuracy and precision of the estimated effects of candidate genes. As factor B increased, both the accuracy and the precision of the effect estimates improved. (4) Factor B also had an important effect on the FDR. In a real data analysis in rice, four candidate genes detected by our model were significantly associated with pasting temperature (PT). Therefore, the polymorphism information content is the primary factor in detecting functional differences among alleles. In addition, a larger number of varieties and a higher contribution rate also have an important influence on the statistical power and on the accuracy and precision of the effect estimates.
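The abstract does not state how the polymorphism information content (factor B) is computed. As an illustration only, the sketch below assumes the widely used definition of Botstein et al. (1980), PIC = 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j², where p_i is the frequency of the i-th allele at a locus. The function name and the input format are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def polymorphism_information_content(allele_counts):
    """Return the PIC of one locus from allele counts or frequencies.

    Assumes the Botstein et al. (1980) definition; the paper's own
    computation may differ. Inputs are normalised to sum to 1.
    """
    total = sum(allele_counts)
    if total <= 0:
        raise ValueError("allele counts must sum to a positive value")
    freqs = [count / total for count in allele_counts]

    # Sum of squared allele frequencies (expected homozygosity).
    homozygosity = sum(p * p for p in freqs)

    # Correction term summed over every unordered pair of distinct alleles.
    pair_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))

    return 1.0 - homozygosity - pair_term


if __name__ == "__main__":
    # Two equally frequent alleles give the biallelic maximum.
    print(polymorphism_information_content([0.5, 0.5]))
    # A locus dominated by one allele carries little information.
    print(polymorphism_information_content([95, 5]))
```

Under this definition a monomorphic locus has a PIC of 0 and a locus with two equally frequent alleles has a PIC of 0.375, which gives a rough sense of what low and high levels of factor B correspond to.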