Attribute selection is an important data preprocessing method in data mining. This paper proposes an attribute selection method that combines a discrete glowworm swarm optimization (DGSO) algorithm with fractal dimension. The method uses fractal dimension as the evaluation criterion for attribute subsets and DGSO as the search strategy. To assess its feasibility and effectiveness, experiments are conducted on six UCI datasets, using 10-fold cross-validation with a support vector machine (SVM) to compare classification accuracy before and after attribute selection. Different search strategies and evaluation criteria are also compared, and the parameters are analyzed in detail. The experimental results indicate that the proposed method is feasible and effective.
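To make the evaluation criterion concrete, the following is a minimal sketch of how a fractal dimension could be estimated for a candidate attribute subset. It rests on assumptions not stated in the abstract: it uses the box-counting estimate of the correlation fractal dimension (the slope of log sum-of-squared cell counts against log cell size), and it scores a subset by how closely its dimension matches that of the full attribute set. The names `correlation_fractal_dimension`, `dimension_gap`, and the `levels` parameter are hypothetical, and the DGSO search itself is not shown.

```python
import math
from collections import Counter


def correlation_fractal_dimension(rows, columns, levels=4):
    """Box-counting estimate of the correlation fractal dimension of
    `rows` projected onto the attribute indices in `columns`.

    Assumption: this estimator is illustrative; the paper's exact
    fractal dimension computation may differ. Requires levels >= 2.
    """
    # Min-max scale each selected attribute to [0, 1] so one grid fits all axes.
    scaled = []
    for c in columns:
        values = [row[c] for row in rows]
        lo, hi = min(values), max(values)
        span = hi - lo
        scaled.append([(v - lo) / span if span > 0 else 0.0 for v in values])

    log_r, log_s = [], []
    for k in range(1, levels + 1):
        cells_per_axis = 2 ** k
        # Count how many points fall into each grid cell of side 1 / 2^k.
        counts = Counter(
            tuple(min(int(col[i] * cells_per_axis), cells_per_axis - 1)
                  for col in scaled)
            for i in range(len(rows))
        )
        log_r.append(math.log(1.0 / cells_per_axis))
        log_s.append(math.log(sum(n * n for n in counts.values())))

    # Least-squares slope of log S(r) against log r.
    mean_r = sum(log_r) / len(log_r)
    mean_s = sum(log_s) / len(log_s)
    num = sum((x - mean_r) * (y - mean_s) for x, y in zip(log_r, log_s))
    den = sum((x - mean_r) ** 2 for x in log_r)
    return num / den


def dimension_gap(rows, subset, levels=4):
    """Hypothetical subset score: absolute difference between the fractal
    dimension of the full attribute set and that of `subset`.
    Smaller means the subset preserves more of the data's structure."""
    all_columns = list(range(len(rows[0])))
    full = correlation_fractal_dimension(rows, all_columns, levels)
    part = correlation_fractal_dimension(rows, subset, levels)
    return abs(full - part)
```

Under these assumptions, a search strategy such as DGSO would propose candidate subsets and prefer those with a small `dimension_gap`. For a dataset whose third attribute duplicates the first, dropping the duplicate leaves the gap near zero, whereas dropping an independent attribute increases it.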