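Intrusion detection log data are characterized by large volume, a large number of features, and many continuous attributes. Traditional feature selection methods must discretize continuous data before processing them, which takes considerable preprocessing time and may discard important information, lowering classification accuracy. To address these problems, we first introduce the neighborhood rough set reduction model, which handles continuous data directly. On this basis, we construct the fitness function for the particles in the particle swarm optimization algorithm. Finally, we present a feature selection algorithm based on the neighborhood rough set model and particle swarm optimization. Simulation results show that the algorithm selects fewer features and improves classification ability.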
Feature selection, also called attribute selection, is the process of selecting a subset of features from a relatively large feature set. It removes redundant and noisy features while keeping features that describe the whole dataset effectively. Intrusion detection data are large in volume and contain many features, so feature selection plays an important role in intrusion detection. Many of these features are numerical. Traditional feature selection methods, such as correlation analysis, information gain (IG), support vector machine (SVM) and rough set methods, must discretize numerical features when dealing with mixed features. Discretization is usually time-consuming and can discard important information, which decreases classification accuracy. To solve these problems, this paper employs the neighborhood rough set reduction model, which processes numerical features directly without discretization. The particle fitness function of the particle swarm optimization (PSO) algorithm is then built on that model. Finally, a novel feature selection algorithm based on particle swarm optimization and the neighborhood rough set reduction model is proposed. Experimental results show that the new algorithm improves classification ability while selecting fewer features.
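The abstract does not give the fitness function itself, so the following is only a minimal sketch of how a neighborhood rough set dependency could score a candidate feature subset for PSO. Every name and parameter here (`neighborhood_dependency`, `fitness`, the radius `delta`, the weight `alpha`) is a hypothetical choice for illustration. The weighted form that trades dependency against subset size is a common convention in rough set based PSO and is assumed here, not taken from the paper. Features are assumed to be scaled to [0, 1].

```python
import numpy as np

def neighborhood_dependency(X, y, subset, delta=0.2):
    """Fraction of samples whose delta-neighborhood, measured on the
    features in `subset`, contains only samples with the same label."""
    if len(subset) == 0:
        return 0.0
    Z = X[:, subset]
    # Pairwise Euclidean distances restricted to the chosen features.
    diff = Z[:, None, :] - Z[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    neighbors = dist <= delta
    same_label = y[:, None] == y[None, :]
    # A sample lies in the positive region if every neighbor shares its label.
    in_positive_region = np.all(~neighbors | same_label, axis=1)
    return float(in_positive_region.mean())

def fitness(X, y, mask, delta=0.2, alpha=0.9):
    """Assumed fitness of a binary particle `mask`: reward high dependency
    and, with smaller weight, a smaller selected subset."""
    subset = np.flatnonzero(mask)
    n_features = X.shape[1]
    gamma = neighborhood_dependency(X, y, subset, delta)
    return alpha * gamma + (1 - alpha) * (n_features - len(subset)) / n_features

# Toy data: feature 0 separates the classes, feature 1 is constant noise.
X = np.array([[0.0, 0.5], [0.1, 0.5], [0.9, 0.5], [1.0, 0.5]])
y = np.array([0, 0, 1, 1])
for mask in ([1, 0], [1, 1], [0, 1]):
    print(mask, fitness(X, y, np.array(mask)))
```

On this toy data the particle that keeps only the discriminating feature scores highest, which matches the stated goal of improving classification ability with fewer features. No discretization step is needed because the neighborhoods are defined directly on the numerical values.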