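To address the respective problems of filter and wrapper methods in feature selection, this paper proposes a new feature selection algorithm based on classification complementarity analysis. The method combines the filter and wrapper approaches. It first removes irrelevant features using ReliefF estimation and symmetric uncertainty estimation, then removes redundant features using symmetric uncertainty estimation, and finally selects the target subset with a wrapper feature selection algorithm based on classification complementarity analysis. Experiments show that the algorithm combines the advantages of both filter and wrapper methods, achieving high accuracy while reducing time cost. Finally, the algorithm is applied to mass detection in digital mammograms, with good results.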
This paper proposes a novel feature selection algorithm based on classification complementarity, intended to compensate for the shortcomings of using the filter or the wrapper feature selection approach alone. Filter methods select features quickly but with low accuracy, while wrapper methods select better-performing features at a high time cost. The proposed algorithm therefore combines the two approaches and consists of two steps. In the first step, it removes irrelevant features using ReliefF estimation and symmetric uncertainty estimation, two correlation measures used to estimate feature performance in classical feature selection methods. Features with low correlation with the class variable are excluded as irrelevant. Symmetric uncertainty estimation is then used to remove redundant features: high symmetric uncertainty between two features indicates redundancy between them, and the worse of the two is excluded. In the second step, the algorithm selects the target feature subset using a wrapper feature selection algorithm based on classification complementarity estimation. We propose the concept of classification complementarity, a new measure for evaluating combinations of feature sets, which indicates whether combining feature sets can improve classification performance. Guided by this measure, feature sets are combined iteratively until no better feature set can be found. Experimental results indicate that the proposed algorithm achieves high accuracy at low time cost and is effective in practical applications.
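The filter stage described above can be illustrated with a minimal sketch, assuming discrete-valued features and the standard definition of symmetric uncertainty, SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)). The function names, the two thresholds, and the reading of "worst" as "lower symmetric uncertainty with the class" are assumptions made for illustration and are not taken from the paper. The ReliefF estimate and the wrapper step based on classification complementarity are not shown.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both variables are constant, so nothing is shared
    joint = entropy(list(zip(x, y)))
    mutual_information = hx + hy - joint
    return 2.0 * mutual_information / (hx + hy)


def filter_features(features, labels,
                    relevance_threshold=0.1,    # assumed value, not from the paper
                    redundancy_threshold=0.9):  # assumed value, not from the paper
    """Drop irrelevant features, then drop redundant ones.

    `features` maps a feature name to its list of discrete values.
    """
    relevance = {name: symmetric_uncertainty(column, labels)
                 for name, column in features.items()}
    # Keep only features sufficiently related to the class, most relevant first.
    ranked = sorted((name for name in features
                     if relevance[name] >= relevance_threshold),
                    key=lambda name: -relevance[name])
    kept = []
    for name in ranked:
        # A feature is redundant if it is highly related to one already kept;
        # the already-kept feature ranks at least as high, so it is the one retained.
        if all(symmetric_uncertainty(features[name], features[k]) < redundancy_threshold
               for k in kept):
            kept.append(name)
    return kept
```

For example, given a feature `a` that matches the labels, an inverted copy of it, and a feature that is independent of the labels, the sketch keeps only `a`: the independent feature fails the relevance threshold and the inverted copy is redundant with `a`.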