In survey sampling, panel data are commonly obtained by following a group of respondents over several fixed time points, and are then used for statistical inference about population characteristics. Panel data often contain missing values, for example when respondents drop out of a longitudinal study prematurely. Most software for handling missing panel data simply deletes the respondents with missing values to obtain a complete data set; when the missing-data mechanism is nonrandom (nonignorable), this leads to biased estimates of the population parameters. This paper describes how to analyze panel data under a nonignorable missing-data mechanism. The approach is mainly model-based likelihood inference, which models the joint distribution of the target variable, the missingness indicator (the drop-out process), and the random-effects vector. Building on existing selection models and pattern-mixture models, random effects are introduced; the paper studies how to compute the expectation of the target variable and how to estimate the parameters of the random-effects hybrid model. For cases in which the variables follow relatively simple distributions, it gives the steps for estimating the population parameters by maximum likelihood. Finally, the methods are compared through a simulation study.
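The bias caused by deleting incomplete respondents can be illustrated with a small simulation. The sketch below is not the paper's model or simulation design; it assumes a hypothetical random-intercept model for the target variable and a logistic drop-out rule that depends on the current, possibly unobserved, value. The function name, parameter values, and drop-out coefficients are all illustrative assumptions.

```python
import numpy as np

def simulate_dropout_bias(n_subjects=5000, n_waves=4, mu=1.0, seed=0):
    """Compare the full-data mean with the complete-case mean at the last wave."""
    rng = np.random.default_rng(seed)

    # Assumed data model: y_it = mu + b_i + e_it with a random intercept b_i.
    b = rng.normal(0.0, 1.0, size=(n_subjects, 1))
    y = mu + b + rng.normal(0.0, 1.0, size=(n_subjects, n_waves))

    # Assumed nonignorable drop-out: the probability of leaving at wave t
    # depends on y_it itself, which is missing once the respondent has left.
    logit = -1.5 + 1.0 * (y - mu)
    p_drop = 1.0 / (1.0 + np.exp(-logit))
    drop = rng.random((n_subjects, n_waves)) < p_drop
    drop[:, 0] = False  # every respondent is observed at the first wave

    # Monotone drop-out: after the first drop, all later waves are missing.
    missing = np.cumsum(drop, axis=1) > 0
    complete = ~missing.any(axis=1)

    return {
        "true_mean": float(y[:, -1].mean()),                   # uses all simulated values
        "complete_case_mean": float(y[complete, -1].mean()),   # completers only
        "n_complete": int(complete.sum()),
    }

if __name__ == "__main__":
    print(simulate_dropout_bias())
```

Because respondents with larger values are more likely to drop out under this assumed rule, the completers are a selected subsample and their mean falls below the full-data mean. A likelihood-based analysis would instead model the drop-out process jointly with the target variable and the random effects.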