Missing values are a common problem in surveys. When variables with missing values are correlated with fully observed variables, the missing values can be imputed from the observed variables under a normal linear model. Compared with single imputation, multiple imputation estimates the population variance more effectively and is therefore more widely used. In Bayesian multiple imputation in particular, both the model parameters and the residuals are drawn at random from their posterior distributions, which makes the estimate of the population variance more accurate. Extensive simulation experiments show that, compared with single imputation and ordinary multiple imputation, Bayesian multiple imputation constructs wider confidence intervals and thereby achieves more accurate coverage of the population parameters; this advantage is more pronounced when the proportion of missing data is large.
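The procedure summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a noninformative prior for the normal linear model, pools the completed-data estimates of the mean with Rubin's combining rules (not specified in the abstract), and uses hypothetical function names (`bayes_impute_once`, `pool_mean`) and simulated data chosen only for illustration.

```python
import numpy as np


def bayes_impute_once(X, y, rng):
    """Return a copy of y with NaNs filled by one Bayesian draw under y = X b + e.

    Assumes e ~ N(0, sigma^2) and a noninformative prior (an assumption of
    this sketch, not something stated in the abstract).
    """
    miss = np.isnan(y)
    X_obs, y_obs = X[~miss], y[~miss]
    n_obs, p = X_obs.shape

    # Least-squares fit on the complete cases.
    XtX_inv = np.linalg.inv(X_obs.T @ X_obs)
    beta_hat = XtX_inv @ X_obs.T @ y_obs
    resid = y_obs - X_obs @ beta_hat

    # Draw sigma^2 from its posterior: residual sum of squares / chi-square(df).
    sigma2 = (resid @ resid) / rng.chisquare(n_obs - p)
    # Draw the coefficients from their posterior given the drawn sigma^2.
    beta = rng.multivariate_normal(beta_hat, sigma2 * XtX_inv)

    # Impute: predicted value plus a random residual with the drawn variance.
    y_imp = y.copy()
    noise = rng.normal(0.0, np.sqrt(sigma2), int(miss.sum()))
    y_imp[miss] = X[miss] @ beta + noise
    return y_imp


def pool_mean(completed):
    """Combine the mean of y across completed data sets with Rubin's rules."""
    m = len(completed)
    q = np.array([c.mean() for c in completed])               # point estimates
    u = np.array([c.var(ddof=1) / c.size for c in completed])  # within variances
    q_bar, u_bar = q.mean(), u.mean()
    b = q.var(ddof=1)                  # between-imputation variance
    t = u_bar + (1.0 + 1.0 / m) * b    # total variance
    return q_bar, u_bar, b, t


# Simulated example: y depends linearly on x, about 30% of y is missing.
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(size=n)
y[rng.random(n) < 0.3] = np.nan

completed = [bayes_impute_once(X, y, rng) for _ in range(5)]
q_bar, u_bar, b, t = pool_mean(completed)
print(f"pooled mean = {q_bar:.3f}, total variance = {t:.5f}")
```

Because the total variance adds the between-imputation component to the average within-imputation variance, the resulting interval is never narrower than one computed from a single completed data set, which is the mechanism behind the wider confidence intervals described above.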