Objective To explore the use of the LASSO for analyzing survival data that are high-dimensional, strongly correlated, and based on small samples. Methods The basic principle and procedural steps of the LASSO are introduced. A stepwise Cox regression model and a LASSO model were each fitted to the breast cancer gene data of Van't Veer et al., and the two models were evaluated and compared using the coefficient of determination as the criterion. Results Stepwise selection retained more independent variables than the LASSO, yet the resulting model had a lower coefficient of determination. This indicates that the LASSO, by shrinking the coefficients of meaningless or nearly meaningless variables to zero, produced the better model. Conclusion By imposing a constraint on the sum of the absolute values of the coefficients, the LASSO reduces the dimension of high-dimensional data and yields a better-fitting model, which makes it well suited to survival analysis of gene data.
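To illustrate how a constraint on the sum of absolute coefficient values sets some coefficients to zero, here is a minimal sketch of an L1-penalized Cox model. It is not the authors' implementation. It assumes the penalized (Lagrangian) form of the constraint, no tied event times, and a plain proximal-gradient solver chosen only for brevity. The data are synthetic, and the names `cox_gradient`, `cox_lasso`, `lam`, `step` and `n_iter` are hypothetical.

```python
import numpy as np


def soft_threshold(z, thresh):
    """Proximal operator of the L1 norm: shrinks entries toward 0, zeroing small ones."""
    return np.sign(z) * np.maximum(np.abs(z) - thresh, 0.0)


def cox_gradient(beta, X, time, event):
    """Gradient of the average negative log partial likelihood (assumes no tied times)."""
    order = np.argsort(-time)        # latest time first: risk set of row i is rows 0..i
    Xs = X[order]
    ev = event[order].astype(float)
    eta = Xs @ beta
    w = np.exp(eta - eta.max())      # a common shift cancels in the ratio below
    risk_mean = np.cumsum(w[:, None] * Xs, axis=0) / np.cumsum(w)[:, None]
    return -(ev[:, None] * (Xs - risk_mean)).sum(axis=0) / len(time)


def cox_lasso(X, time, event, lam, step=0.05, n_iter=1000):
    """Proximal gradient descent on: negative log partial likelihood + lam * sum|beta_j|."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = cox_gradient(beta, X, time, event)
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta


# Synthetic data for illustration only (NOT the Van't Veer data):
# 60 subjects, 50 covariates, of which only the first two affect the hazard.
rng = np.random.default_rng(0)
n, p = 60, 50
X = rng.standard_normal((n, p))
true_beta = np.zeros(p)
true_beta[:2] = [1.5, -1.5]
event_time = rng.exponential(1.0 / np.exp(X @ true_beta))
censor_time = rng.exponential(2.0, size=n)
time = np.minimum(event_time, censor_time)
event = event_time <= censor_time    # True = event observed, False = censored

# Smallest penalty at which every coefficient stays at zero.
lam_max = np.max(np.abs(cox_gradient(np.zeros(p), X, time, event)))
beta_hat = cox_lasso(X, time, event, lam=0.5 * lam_max)
print("nonzero coefficients:", np.count_nonzero(beta_hat), "of", p)
```

A larger `lam` corresponds to a tighter bound on the sum of absolute coefficients and leaves fewer variables in the model. In practice `lam` would be chosen by cross-validation, and an established package would be preferable to this sketch.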