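Feature selection is one of the key problems in machine learning and data mining, and the stability of feature selection is currently an active research topic. Based on the energy-based learning model, this paper analyzes a local-energy-based feature selection method and, following the principle of ensemble feature selection, aggregates the local-energy-based feature ranking results to improve the stability of the algorithm. Experimental results on real-world data sets show that ensemble feature selection can effectively improve the stability of the algorithm.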
Feature selection, which reduces the dimensionality of data, is one of the key problems in machine learning and data mining, and the stability of feature selection is an active research topic. Stability is the insensitivity of a feature selection algorithm's result to variations in the training set. It is particularly critical in applications where feature selection serves as a knowledge discovery tool for identifying characteristic markers that explain observed phenomena. This paper first describes the feature selection algorithm Lmba in detail and analyzes its evaluation criterion in terms of the energy-based model; Lmba can be regarded as a feature ranking algorithm based on the local energy of samples. Second, to improve its stability, an ensemble version of local energy-based feature ranking is proposed, motivated by the observation that ensemble learning is very effective at improving stability. Experiments on real-world data sets show that the ensemble results are more stable than those of a single ranker.
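The ensemble idea can be illustrated with a minimal sketch. It makes several assumptions that are not taken from the paper: the base scorer `base_scores` is a simple stand-in (a standardized difference of class means for binary labels), not the Lmba criterion; the training set is perturbed by stratified bootstrap resampling; and the individual rankings are aggregated by mean rank. All function names and parameters are hypothetical.

```python
import numpy as np

def base_scores(X, y):
    """Stand-in scorer (NOT Lmba): |difference of class means| / overall std."""
    a, b = X[y == 0], X[y == 1]
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / (X.std(axis=0) + 1e-12)

def ranks_from_scores(scores):
    """Convert scores to ranks, where rank 0 is the highest-scoring feature."""
    order = np.argsort(-scores)
    ranks = np.empty_like(order)
    ranks[order] = np.arange(len(scores))
    return ranks

def ensemble_ranking(X, y, n_rounds=30, seed=0):
    """Aggregate rankings from bootstrap samples by mean rank (assumed scheme)."""
    rng = np.random.default_rng(seed)
    class_idx = [np.flatnonzero(y == c) for c in (0, 1)]
    total = np.zeros(X.shape[1])
    for _ in range(n_rounds):
        # Stratified bootstrap: resample with replacement within each class.
        idx = np.concatenate([rng.choice(ci, size=len(ci), replace=True)
                              for ci in class_idx])
        total += ranks_from_scores(base_scores(X[idx], y[idx]))
    # Feature indices sorted from best (lowest mean rank) to worst.
    return np.argsort(total / n_rounds)
```

Calling `ensemble_ranking(X, y)` on a numeric array `X` of shape `(n_samples, n_features)` with binary labels `y` in `{0, 1}` returns feature indices ordered from most to least relevant under the stand-in criterion. Averaging ranks over resampled training sets is one common way to make the final ranking less sensitive to any single training set.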