Objective To introduce the basic principles of Bayesian model averaging (BMA) and to demonstrate its advantages by analysing a real data set. Methods Using Hosmer and Lemeshow's cohort study of risk factors for low birth weight as an example, we selected the best model with BMA and with stepwise logistic regression, then compared the two sets of results and examined the reasons for their differences. Results The 10 models with the highest posterior probabilities under BMA had a cumulative posterior probability of only 0.59, indicating substantial model uncertainty. The posterior probability of the best model chosen by stepwise logistic regression (P(βk≠0|D) < 0.032) was far lower than that of the best model chosen by BMA (P(βk≠0|D) = 0.12). Comparing the coefficient estimates, standard errors and P-values from the two methods showed that BMA estimated the parameters with higher precision and narrower confidence intervals, whereas stepwise logistic regression, which ignores model uncertainty, tended to overstate the results, leading to overconfident inferences and decisions that are riskier than they appear. Conclusion Because BMA accounts for model uncertainty, its results are more reliable, and it has promising applications in statistical modelling.
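The quantities reported in the Results (posterior model probabilities and the inclusion probability P(βk≠0|D)) can be illustrated with a minimal Python sketch. It assumes the common BIC approximation to the marginal likelihood and equal prior model probabilities; the abstract does not state how the posterior probabilities were computed, so this is only one plausible route. The variable names `x1` and `x2`, the coefficient values and the BIC values are hypothetical and are not taken from the study.

```python
import math

def posterior_model_probs(bics, priors=None):
    """Approximate P(M_k | D) from BIC values.

    Assumption: P(D | M_k) is approximated by exp(-BIC_k / 2).
    With priors=None every model gets the same prior probability.
    """
    if priors is None:
        priors = [1.0 / len(bics)] * len(bics)
    best = min(bics)
    # Subtract the smallest BIC before exponentiating for numerical stability.
    weights = [p * math.exp(-0.5 * (b - best)) for b, p in zip(bics, priors)]
    total = sum(weights)
    return [w / total for w in weights]

def inclusion_prob(models, probs, variable):
    """P(beta_variable != 0 | D): total posterior mass of models containing the variable."""
    return sum(p for m, p in zip(models, probs) if variable in m)

def averaged_coef(models, probs, variable):
    """Model-averaged coefficient; a model that omits the variable contributes 0."""
    return sum(p * m.get(variable, 0.0) for m, p in zip(models, probs))

# Hypothetical candidate models: variable name -> fitted coefficient.
candidate_models = [
    {"x1": 0.70, "x2": -0.012},
    {"x1": 0.65},
    {"x2": -0.014},
]
# Hypothetical BIC values for the three fitted models.
bics = [230.1, 231.4, 233.0]

probs = posterior_model_probs(bics)
p_x1 = inclusion_prob(candidate_models, probs, "x1")
beta_x1 = averaged_coef(candidate_models, probs, "x1")
print(probs, p_x1, beta_x1)
```

Under this sketch, a low posterior probability for even the best single model, as reported in the Results, means the weight is spread across many candidate models, which is the model uncertainty that stepwise selection does not account for.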