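Objective To observe and evaluate how sample size affects methods for estimating relative importance, and how repeated sampling affects the stability of each method's estimates, by drawing datasets of different sample sizes and repeated samples of the same size from a single population. Methods Several existing importance evaluation methods are briefly introduced. The SAS PROC SURVEYSELECT procedure was used to sample repeatedly from the same population, in order to observe how changes in sample size and the repeated sampling process affect the importance estimates and to evaluate the stability of each method. Results When the sample size was small, the importance estimates of every method varied considerably; as the sample size increased, the estimates gradually stabilized. For dominance analysis, relative weights, and the product measure, the difference between the sum of the importance estimates and the model R^2 was smaller than for squared standardized regression coefficients (β^2) and squared simple correlation coefficients (r^2). Dominance analysis was the most stable. Conclusion Among the common importance estimation methods considered, dominance analysis gives the most stable importance estimates; the relative weight method comes closest to dominance analysis but still has shortcomings.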
Objective To draw random samples from a simulated population in order to evaluate the impact of sample size and the sampling process on several commonly used importance evaluation methods, and to observe the stability of those methods. Methods Existing importance methods are introduced. The PROC SURVEYSELECT procedure was used to sample a fixed population 1000 times, generating 1000 samples of the same size, to evaluate the stability of the relative importance methods. The population was also sampled to generate datasets of different sample sizes, to observe the impact of sample size on those methods. Results The sum of the squared correlation coefficients was larger than the model R-square, while the sum of the squared standardized regression coefficients was smaller. In contrast, the sums of the Product Measure, Relative Weight, and Dominance Analysis estimates were extremely close to the model R-square. When the sample size was smaller than 1000, the estimates varied noticeably, but the variation decreased as the sample size increased. Conclusion Among these methods, dominance analysis has the best stability and the closest match to the model R-square.
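The resampling design described in the Methods can be illustrated with a minimal sketch. It is written in Python with NumPy rather than SAS, so `rng.choice(..., replace=False)` only stands in for the simple random sampling that PROC SURVEYSELECT performs in the study. The population, its covariance matrix, the coefficients, the sample sizes (100 and 2000), and the 200 replicates are all hypothetical values chosen for illustration; they are not the study's settings. Only three of the five measures are computed (squared correlations, squared standardized coefficients, and the product measure); dominance analysis and relative weights are omitted to keep the example short.

```python
import numpy as np

def importance_measures(X, y):
    """Return r^2, beta^2 and the product measure per predictor, plus model R^2."""
    # Standardize so the least-squares coefficients are standardized betas.
    Xz = (X - X.mean(axis=0)) / X.std(axis=0)
    yz = (y - y.mean()) / y.std()
    r = Xz.T @ yz / len(yz)                          # simple correlations with y
    beta, *_ = np.linalg.lstsq(Xz, yz, rcond=None)   # standardized coefficients
    resid = yz - Xz @ beta
    r2_model = 1.0 - np.mean(resid ** 2)             # var(yz) is 1 after standardizing
    return {"r2": r ** 2, "beta2": beta ** 2, "product": beta * r, "R2": r2_model}

def simulate_population(n_pop, rng):
    """Hypothetical population with three positively correlated predictors."""
    cov = np.array([[1.0, 0.5, 0.3],
                    [0.5, 1.0, 0.4],
                    [0.3, 0.4, 1.0]])
    X = rng.multivariate_normal(np.zeros(3), cov, size=n_pop)
    y = X @ np.array([0.5, 0.3, 0.2]) + rng.normal(size=n_pop)
    return X, y

def resample_sd(X, y, n, reps, rng):
    """SD of product-measure estimates over repeated samples of size n."""
    estimates = []
    for _ in range(reps):
        # Simple random sample without replacement from the fixed population.
        idx = rng.choice(len(y), size=n, replace=False)
        estimates.append(importance_measures(X[idx], y[idx])["product"])
    return np.std(np.array(estimates), axis=0)

rng = np.random.default_rng(0)
X, y = simulate_population(20000, rng)
full = importance_measures(X, y)
sd_small = resample_sd(X, y, n=100, reps=200, rng=rng)
sd_large = resample_sd(X, y, n=2000, reps=200, rng=rng)

print("sum r^2     :", full["r2"].sum())
print("sum beta^2  :", full["beta2"].sum())
print("sum product :", full["product"].sum(), " model R^2:", full["R2"])
print("SD at n=100 :", sd_small)
print("SD at n=2000:", sd_large)
```

The product measure sums to the model R^2 by algebraic identity, so its agreement with R^2 holds in every sample. In this hypothetical population the sum of squared correlations exceeds R^2 and the sum of squared standardized coefficients falls below it, which is the same direction as the pattern reported in the Results, although that ordering depends on the correlation structure assumed here.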