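Objective To examine the statistical properties and practical characteristics of propensity score (PS) regression for data with multicollinearity by comparing its results with those of conventional logistic regression. Methods Monte Carlo (MC) simulation was used to compare PS regression with logistic regression on multicollinear data across different sample sizes and different levels of correlation between the covariates and the exposure variable. Results (1) With the positive rate of the outcome variable fixed at 4% and a high correlation between the covariates and the exposure variable (r = 0.92), the regression coefficients from PS regression were closer to the estimates of the standard model than those from logistic regression; as sample size increased, the coefficient estimates of the two models gradually converged and the estimation error became progressively smaller. (2) With sample size fixed, the regression coefficients from PS regression were closer to those of the standard model than the coefficients from conventional logistic regression. In addition, as the correlation between the covariates and the exposure variable changed, the PS regression coefficients followed the same trend as those of the standard model; that is, the coefficient estimates of these two models were not affected by the degree of collinearity. By contrast, the estimates from the logistic regression model (both the regression coefficient and its standard error) were affected by both sample size and the covariate-exposure correlation, which biased the parameter estimates. Different combinations of sample size and correlation produced different degrees of deviation in the logistic regression estimates. Compared with the ordinary logistic regression model, therefore, the advantage of PS regression in handling collinearity was more pronounced in data with small sample sizes. Conclusion Based on these results, we consider the parameter estimates from PS regression to be more reliable than those from logistic regression when analysing multicollinear data. PS regression should be considered in particular for data with small sample sizes and a high degree of collinearity among variables, in order to avoid biased parameter estimates.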
Objective The aim of our study was to determine whether the propensity score (PS) method would yield better parameter estimates than the common logistic regression model when dealing with collinear data under different sample sizes and degrees of collinearity. Methods A logistic model fitted to data without collinearity was used as the standard model. Monte Carlo simulation was employed to compare parameter estimates between PS regression and common logistic regression in dealing with collinear data under conditions of different sample sizes and degrees of collinearity. Results (1) Given a 4% positive proportion of the outcome variable and a correlation coefficient (r) of 0.92 between the covariates and the exposure factor, the parameter estimates from PS regression, both the regression coefficient and its standard error, were closer to those of the standard model than the estimates from the common logistic regression. These differences gradually disappeared as sample size increased. (2) Given sample sizes of 1000 and 500 and a 4% positive proportion of the outcome variable, we estimated the regression coefficient and its standard error from the three models across degrees of collinearity. The trend of the parameters estimated from PS regression was parallel to that of the standard model, meaning that the difference between these two models was constant. The regression coefficient and standard error estimated from the common logistic regression, however, paralleled those of the other two models only when r was low; the trend changed direction at r = 0.5 (n = 1000) or r = 0.3 (n = 500). Conclusion The parameters estimated from PS regression were more reliable than those from the common logistic regression, especially under conditions of small sample size and severe collinearity. Therefore, PS regression could be a useful method for dealing with collinear data.
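The kind of Monte Carlo comparison described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' simulation design. The number of covariates, the equicorrelated latent-normal structure, the true coefficients, the intercept, and the function names `fit_logistic`, `simulate_once` and `run_mc` are all hypothetical. Here `r` is the correlation on the latent normal scale, so the correlation with the dichotomised exposure is somewhat lower. The "standard model" fitted to non-collinear data is not reproduced.

```python
import numpy as np


def fit_logistic(X, y, n_iter=25, ridge=1e-6):
    """Newton-Raphson logistic fit. X must already contain an intercept column.

    The tiny ridge term is only a numerical safeguard (an assumption of this
    sketch), not part of either method being compared.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = np.clip(X @ beta, -30.0, 30.0)   # avoid overflow in exp
        p = 1.0 / (1.0 + np.exp(-eta))
        w = p * (1.0 - p)
        hessian = X.T @ (X * w[:, None]) + ridge * np.eye(X.shape[1])
        gradient = X.T @ (y - p) - ridge * beta
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta


def simulate_once(rng, n, r, n_cov=3, beta_exposure=0.7, intercept=-3.5):
    """One replicate: returns the exposure coefficient from both models."""
    # Latent equicorrelated normals: column 0 drives exposure, the rest are covariates.
    cov = np.full((n_cov + 1, n_cov + 1), r)
    np.fill_diagonal(cov, 1.0)
    latent = rng.multivariate_normal(np.zeros(n_cov + 1), cov, size=n)
    z = (latent[:, 0] > 0).astype(float)       # binary exposure
    C = latent[:, 1:]                          # covariates
    # Rare binary outcome; the coefficients here are illustrative only.
    eta = intercept + beta_exposure * z + 0.3 * C.sum(axis=1)
    y = (rng.random(n) < 1.0 / (1.0 + np.exp(-eta))).astype(float)

    ones = np.ones((n, 1))
    # Conventional logistic regression: outcome on exposure plus all covariates.
    b_conv = fit_logistic(np.hstack([ones, z[:, None], C]), y)[1]
    # PS regression: estimate P(exposure | covariates), then adjust for it alone.
    XC = np.hstack([ones, C])
    ps = 1.0 / (1.0 + np.exp(-(XC @ fit_logistic(XC, z))))
    b_ps = fit_logistic(np.hstack([ones, z[:, None], ps[:, None]]), y)[1]
    return b_conv, b_ps


def run_mc(n, r, reps=200, seed=0):
    """Mean and empirical SD of the exposure coefficient over `reps` replicates."""
    rng = np.random.default_rng(seed)
    est = np.array([simulate_once(rng, n, r) for _ in range(reps)])
    return {
        "conventional": {"mean": est[:, 0].mean(), "sd": est[:, 0].std(ddof=1)},
        "ps": {"mean": est[:, 1].mean(), "sd": est[:, 1].std(ddof=1)},
    }
```

Calling, for example, `run_mc(n=500, r=0.92)` and `run_mc(n=1000, r=0.92)` gives the average exposure coefficient and its spread for each approach, which can then be tabulated across sample sizes and values of `r`. Replacing a set of mutually correlated covariates with a single score is what removes the collinear terms from the outcome model; whether this sketch reproduces the pattern reported above depends on the assumed data-generating settings.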