Diagnosing the six types of erythemato-squamous diseases has long been a difficult problem in dermatology. The data for these diseases are nominal (categorical, qualitative), so processing methods designed for quantitative data are poorly suited to them. This paper proposes a new method, a group lasso penalized multinomial regression classifier, for feature selection and classification of nominal data, and applies it to the diagnosis of erythemato-squamous diseases. The first 33 nominal attributes are dummy coded; the 34th attribute, age, is discretized and then dummy coded. The resulting dummy-coded data are grouped by class and by variable and fed into the group lasso penalized multinomial regression classifier. Under 10-fold cross-validation, the classification accuracy reached 98.88% ± 0.0023%. Compared with other methods reported in the literature, the proposed method is simple, classifies accurately and efficiently, and offers strong interpretability and stability.
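The preprocessing described in the abstract (dummy coding of nominal attributes, discretizing age, and grouping the dummy columns by variable) can be illustrated with a minimal sketch. It is not the authors' implementation. The toy values, the age cut points `AGE_CUTS`, the choice of one indicator per level, and all function names are assumptions made for illustration. The penalty shown is the commonly used group lasso form, lambda times the sum over groups of sqrt(group size) times the Euclidean norm of the group's coefficients, which may differ in detail from the one used in the paper.

```python
import math
from typing import List, Sequence, Tuple


def dummy_encode(column: Sequence[int], levels: Sequence[int]) -> List[List[int]]:
    """Dummy-code one nominal column: one 0/1 indicator per level."""
    return [[1 if value == level else 0 for level in levels] for value in column]


def discretize_age(age: float, cut_points: Sequence[float]) -> int:
    """Return the bin index of an age: the number of cut points it has reached."""
    return sum(1 for cut in cut_points if age >= cut)


def encode_with_groups(
    columns: Sequence[Sequence[int]],
    levels_per_column: Sequence[Sequence[int]],
) -> Tuple[List[List[int]], List[int]]:
    """Concatenate dummy-coded columns and record, for every output column,
    the index of the original variable (its group) it came from."""
    rows: List[List[int]] = [[] for _ in range(len(columns[0]))]
    groups: List[int] = []
    for group_id, (column, levels) in enumerate(zip(columns, levels_per_column)):
        for row, indicators in zip(rows, dummy_encode(column, levels)):
            row.extend(indicators)
        groups.extend([group_id] * len(levels))
    return rows, groups


def group_lasso_penalty(weights: Sequence[float], groups: Sequence[int], lam: float) -> float:
    """Common group lasso penalty: lam * sum_g sqrt(p_g) * ||w_g||_2."""
    total = 0.0
    for group_id in sorted(set(groups)):
        w_g = [w for w, g in zip(weights, groups) if g == group_id]
        total += math.sqrt(len(w_g)) * math.sqrt(sum(w * w for w in w_g))
    return lam * total


# Toy data (assumed): one nominal attribute with levels 0..3 and an age column.
nominal_attr = [0, 2, 3, 1]
ages = [8, 35, 52, 70]
AGE_CUTS = [20, 40, 60]  # hypothetical cut points, not taken from the paper

age_bins = [discretize_age(a, AGE_CUTS) for a in ages]
X, groups = encode_with_groups([nominal_attr, age_bins], [range(4), range(4)])
penalty = group_lasso_penalty([3.0, 4.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], groups, lam=1.0)
```

Because the penalty acts on the norm of each group of coefficients as a whole, all dummy columns belonging to one original attribute are kept or dropped together, which is what makes the grouped encoding usable for feature selection on nominal variables.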