Feature selection has become a key problem in high-dimensional data processing, especially in pattern recognition. This paper presents a hybrid feature selection model that selects the most significant features from all potentially relevant features. The model consists of two parts: a filter and a wrapper. In the filter, four different filter (variable ranking) methods each rank the candidate features independently; the four rankings are then fused into a single combined ranking, and the initial population of a genetic algorithm (GA) is generated from the significance of the features in that combined ranking. In the wrapper, the GA evaluates each individual (a feature subset) by the classification accuracy of a neural network classifier in order to search for the optimal feature subset. Tests on several datasets show that the model not only reduces the size of the feature subset effectively but also improves the accuracy and efficiency of classification.
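The filter-then-wrapper pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the four filter methods, the fusion rule, or the GA operators, so everything below is an assumption made for illustration. The function names (`fuse_rankings`, `seed_population`, `run_ga`, `toy_fitness`) are hypothetical, the fusion rule is taken to be a mean-rank (Borda-style) combination, the seeding probabilities `p_top` and `p_bottom` are arbitrary, and `toy_fitness` is only a stand-in for the neural network classification accuracy used in the paper.

```python
import random


def fuse_rankings(rankings):
    """Fuse several rankings into one by summed rank position (assumed rule).

    Each ranking lists feature indices from most to least important.
    """
    n = len(rankings[0])
    total = [0] * n
    for ranking in rankings:
        for position, feature in enumerate(ranking):
            total[feature] += position
    # A lower summed position means a more important feature;
    # ties are broken by feature index.
    return sorted(range(n), key=lambda f: (total[f], f))


def seed_population(fused, pop_size, p_top=0.9, p_bottom=0.1, rng=None):
    """Create bit-mask individuals; better-ranked features are switched on more often."""
    rng = rng or random.Random(0)
    n = len(fused)
    prob = [0.0] * n
    for position, feature in enumerate(fused):
        frac = position / (n - 1) if n > 1 else 0.0
        # Inclusion probability falls linearly from p_top to p_bottom.
        prob[feature] = p_top + (p_bottom - p_top) * frac
    return [[1 if rng.random() < prob[f] else 0 for f in range(n)]
            for _ in range(pop_size)]


def run_ga(population, fitness, generations=20, p_mut=0.05, rng=None):
    """Plain GA: binary tournament, one-point crossover, bit-flip mutation, elitism."""
    rng = rng or random.Random(0)
    best = max(population, key=fitness)
    for _ in range(generations):
        def pick():
            a, b = rng.sample(population, 2)
            return a if fitness(a) >= fitness(b) else b

        children = [best[:]]  # elitism: the best individual always survives
        while len(children) < len(population):
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, len(p1))
            child = p1[:cut] + p2[cut:]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        population = children
        best = max(population, key=fitness)
    return best


def toy_fitness(mask):
    """Stand-in for classifier accuracy (assumption): features 0 and 2 are
    treated as informative, and every selected feature costs a small penalty."""
    return sum(mask[f] for f in (0, 2)) - 0.1 * sum(mask)


# Four made-up rankings of six features, one per hypothetical filter method.
rankings = [
    [0, 2, 1, 3, 4, 5],
    [2, 0, 3, 1, 5, 4],
    [0, 1, 2, 4, 3, 5],
    [2, 0, 1, 3, 4, 5],
]
fused = fuse_rankings(rankings)
population = seed_population(fused, pop_size=10)
best = run_ga(population, toy_fitness)
```

In a real setting, `toy_fitness` would be replaced by a function that trains a classifier on the selected columns and returns its validation accuracy. Seeding the population from the fused ranking means the GA starts its search near subsets the filters already consider promising, rather than from uniformly random subsets.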