To address feature selection on datasets with discrete values, an evolutionary feature selection algorithm based on relative classification information entropy is proposed. A genetic algorithm searches for the optimal feature subset, and relative classification information entropy measures the significance of each candidate subset. Specifically, relative classification information entropy serves as the fitness function, candidate solutions are encoded as binary strings, and the individuals of the next generation are selected by the roulette wheel method. Experimental results show that the proposed algorithm outperforms the compared methods in testing accuracy. Furthermore, the feasibility of the proposed algorithm is proved theoretically.
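The search procedure summarized above can be illustrated with a minimal Python sketch. The abstract does not define relative classification information entropy, so the fitness below substitutes a normalized conditional entropy of the class given the selected features, with a small penalty on subset size; this stand-in, the function names, and all parameter values (`penalty`, `pop_size`, `generations`, `p_cross`, `p_mut`) are assumptions made for illustration, not the measure or settings of the paper. One-point crossover and bit-flip mutation are likewise assumed, since the abstract specifies only the encoding and the selection scheme.

```python
import math
import random
from collections import Counter, defaultdict


def conditional_entropy(rows, labels, mask):
    """H(class | selected features) for discrete data (assumed stand-in measure)."""
    groups = defaultdict(Counter)
    for row, y in zip(rows, labels):
        # Project each sample onto the features whose mask bit is 1.
        key = tuple(v for v, keep in zip(row, mask) if keep)
        groups[key][y] += 1
    n = len(rows)
    h = 0.0
    for counts in groups.values():
        size = sum(counts.values())
        for c in counts.values():
            p = c / size
            h -= (size / n) * p * math.log2(p)
    return h


def fitness(rows, labels, mask, penalty=0.01):
    """Higher is better: low class uncertainty and few features (assumed form)."""
    h_max = math.log2(max(2, len(set(labels))))
    score = 1.0 - conditional_entropy(rows, labels, mask) / h_max
    score -= penalty * sum(mask) / len(mask)
    return max(score, 1e-9)  # keep roulette weights strictly positive


def roulette_select(population, scores, rng):
    """Pick one individual with probability proportional to its score."""
    pick = rng.uniform(0.0, sum(scores))
    running = 0.0
    for individual, score in zip(population, scores):
        running += score
        if pick < running:
            return individual
    return population[-1]  # guard against floating-point round-off


def select_features(rows, labels, pop_size=20, generations=30,
                    p_cross=0.8, p_mut=0.05, seed=0):
    """Genetic search over binary feature masks; returns (best_mask, best_score)."""
    rng = random.Random(seed)
    n_feat = len(rows[0])
    population = [[rng.randint(0, 1) for _ in range(n_feat)]
                  for _ in range(pop_size)]
    best, best_score = None, -1.0
    for _ in range(generations):
        scores = [fitness(rows, labels, ind) for ind in population]
        for ind, score in zip(population, scores):
            if score > best_score:
                best, best_score = ind[:], score
        offspring = []
        while len(offspring) < pop_size:
            a = roulette_select(population, scores, rng)[:]
            b = roulette_select(population, scores, rng)[:]
            if n_feat > 1 and rng.random() < p_cross:
                cut = rng.randint(1, n_feat - 1)  # one-point crossover
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            for child in (a, b):
                # Bit-flip mutation applied independently to each gene.
                offspring.append([bit ^ 1 if rng.random() < p_mut else bit
                                  for bit in child])
        population = offspring[:pop_size]
    return best, best_score


if __name__ == "__main__":
    # Toy data: the class equals feature 0, feature 1 is uninformative.
    rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
    labels = [0, 0, 1, 1]
    print(select_features(rows, labels))
```

Each individual is a list of 0/1 bits, one per feature, so the returned mask can be read directly as the selected subset. Clamping the fitness to a small positive value keeps every roulette weight valid even when a subset carries no class information.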