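To address the low-rank structure and feature redundancy typical of high-dimensional data, this paper proposes an unsupervised hypergraph feature selection algorithm based on feature self-representation. The algorithm first exploits the self-representation property to express each feature sparsely in terms of the other features, and imposes a low-rank assumption on this self-representation to find a low-rank representation of the high-dimensional data. It then builds a hypergraph regularizer to preserve the local structure of the data, and finally uses a sparse regularizer to perform feature selection. The self-representation property determines the importance of each feature; the low-rank representation amounts to subspace learning that accounts for the global information of the data, while the hypergraph regularizer performs subspace learning that accounts for its local structure. The algorithm therefore carries out subspace learning using both global and local information, and is in effect a feature selection algorithm with subspace learning embedded in it. Experimental results show that, compared with the competing algorithms, the proposed algorithm selects features more effectively and achieves good classification performance.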
Because high-dimensional data are usually low-rank and contain redundant features, this paper proposes a novel unsupervised hypergraph feature selection algorithm based on the self-representation property of features. First, it uses a self-representation matrix to sparsely represent each feature as a linear combination of the other features. A low-rank assumption is then imposed on this self-representation to learn a low-rank representation of the high-dimensional data, which amounts to subspace learning that considers the global structure of the data. Second, it captures the local structure of the data with a hypergraph-based regularizer. In this way, the proposed method integrates subspace learning into the feature selection framework. Experimental results demonstrate that the proposed method selects the most discriminative features and achieves the best classification performance among the competing methods.
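To make the interplay of these components concrete, the following is a minimal sketch and not the paper's actual optimization procedure. It assumes an objective of the form ||X - XW||_F^2 + alpha * tr(W^T X^T L X W) + beta * ||W||_{2,1}, where X is the n-by-d data matrix, W is the d-by-d self-representation matrix, and L is a hypergraph Laplacian. The function names, the parameters `alpha`, `beta`, `k`, and `rank`, the nearest-neighbour hyperedge construction with unit weights, the iteratively reweighted solver, and the final truncated-SVD rank reduction are all assumptions introduced for illustration. No zero-diagonal constraint is placed on W, which is a further simplification.

```python
import numpy as np


def hypergraph_laplacian(X, k=3):
    """Normalized hypergraph Laplacian, one hyperedge per sample (assumed construction)."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances between samples.
    sq = np.sum(X ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    # Incidence matrix H: H[v, e] = 1 if sample v belongs to hyperedge e.
    H = np.zeros((n, n))
    for e in range(n):
        H[np.argsort(dist[e])[: k + 1], e] = 1.0  # sample e and its nearest neighbours
        H[e, e] = 1.0  # guarantee every vertex has nonzero degree
    w = np.ones(n)           # unit hyperedge weights (assumption)
    dv = H @ w               # vertex degrees
    de = H.sum(axis=0)       # hyperedge degrees
    dv_inv_sqrt = np.diag(1.0 / np.sqrt(dv))
    theta = dv_inv_sqrt @ H @ np.diag(w / de) @ H.T @ dv_inv_sqrt
    return np.eye(n) - theta


def select_features(X, alpha=0.1, beta=1.0, rank=None, k=3, n_iter=30, eps=1e-8):
    """Rank features by the row norms of a sparse self-representation matrix W."""
    X = X - X.mean(axis=0)                 # centre each feature
    L = hypergraph_laplacian(X, k)
    G = X.T @ X                            # Gram matrix of the features
    M = G + alpha * X.T @ L @ X            # data-fit term plus local-structure term
    D = np.eye(X.shape[1])                 # reweighting matrix for the l2,1 norm
    for _ in range(n_iter):
        # Closed-form update of W for fixed D.
        W = np.linalg.solve(M + beta * D, G)
        # Update D from the row norms of W (eps avoids division by zero).
        row_norms = np.sqrt(np.sum(W ** 2, axis=1) + eps)
        D = np.diag(1.0 / (2.0 * row_norms))
    if rank is not None:
        # Crude low-rank step: keep only the leading singular directions of W.
        U, s, Vt = np.linalg.svd(W, full_matrices=False)
        W = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    scores = np.linalg.norm(W, axis=1)     # importance of each feature
    return np.argsort(-scores), W
```

Calling `select_features(X, rank=2)` on an n-by-d array returns the feature indices ordered from most to least important together with the learned W; keeping the first few indices gives the selected feature subset. The truncated SVD here only stands in for a low-rank constraint that a full method would enforce inside the optimization itself.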