Variable selection is an important part of statistical modeling. Choosing suitable variables yields a robust model that is simple in structure, clear in interpretation, and accurate in prediction. In practice, some predictors have a group structure. This paper reviews three classes of penalized group variable selection methods: methods for handling highly correlated variables, methods that select only at the group level, and bi-level methods that select both groups and individual variables within them. We focus on comparing their statistical properties, advantages, and disadvantages, and we summarize the associated algorithms and the methods for choosing tuning parameters. Finally, we survey applications of these methods and discuss recent directions of development and the challenges that remain.