To improve the classification accuracy of the AdaBoost classifier ensemble algorithm and to reduce the complexity of the classification system, an AdaBoost support vector machine ensemble algorithm that integrates instance selection and feature selection (IFSelect-SVME) is proposed. In each cycle of AdaBoost, the algorithm selects an instance subset with a weighted immune clonal instance selection algorithm and a feature subset with a mutual information sequential forward feature selection algorithm. An individual SVM classifier is then trained on the optimized feature-instance subset obtained in that cycle, and the individual classifiers are combined by weighting to form the final decision system. Simulation results on 9 UCI datasets show that, compared with the support vector machine ensemble (SVME) algorithm, IFSelect-SVME achieves a higher correct classification rate while reducing the number of instances by 30.8% to 80.0% and the number of features by 32.2% to 81.5%. The proposed algorithm thus simplifies the ensemble structure, shortens the classification time for test instances, and yields a classification system with better classification accuracy.
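The per-cycle structure described in the abstract can be sketched as below. This is only an illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, the instance selector (heaviest instances per class) stands in for the weighted immune clonal algorithm, the feature selector scores candidates by relevance minus mean redundancy (the abstract does not give the exact criterion), and a weighted decision stump on binary features stands in for the SVM.

```python
import math
from collections import Counter


def boost_with_selection(X, y, rounds, select_instances, select_features, fit):
    """AdaBoost-style loop with per-round instance and feature selection.

    Labels must be -1 or +1. The three hooks are hypothetical placeholders.
    """
    n = len(X)
    w = [1.0 / n] * n                       # AdaBoost sample weights
    ensemble = []                           # (alpha, feature indices, predictor)
    for _ in range(rounds):
        rows = select_instances(X, y, w)    # instance subset for this round
        cols = select_features(X, y, rows)  # feature subset for this round
        Xs = [[X[i][j] for j in cols] for i in rows]
        predict = fit(Xs, [y[i] for i in rows], [w[i] for i in rows])
        preds = [predict([x[j] for j in cols]) for x in X]
        err = sum(wi for wi, p, t in zip(w, preds, y) if p != t)
        if err >= 0.5:                      # no better than chance: stop early
            break
        alpha = 0.5 * math.log((1 - err) / max(err, 1e-10))
        ensemble.append((alpha, cols, predict))
        # Raise the weight of misclassified instances, then renormalize.
        w = [wi * math.exp(-alpha * t * p) for wi, p, t in zip(w, preds, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict_ensemble(ensemble, x):
    """Weighted vote of the individual classifiers."""
    score = sum(a * h([x[j] for j in cols]) for a, cols, h in ensemble)
    return 1 if score >= 0 else -1


def mutual_info(a, b):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(c / n * math.log(c * n / (pa[u] * pb[v]))
               for (u, v), c in pab.items())


def mi_forward(X, y, rows, k=2):
    """Greedy forward selection; the scoring rule is an assumption."""
    column = lambda j: [X[i][j] for i in rows]
    labels = [y[i] for i in rows]
    chosen, remaining = [], list(range(len(X[0])))
    while remaining and len(chosen) < k:
        def score(j):
            relevance = mutual_info(column(j), labels)
            redundancy = sum(mutual_info(column(j), column(s)) for s in chosen)
            return relevance - redundancy / max(len(chosen), 1)
        best = max(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
    return sorted(chosen)


def heaviest_per_class(X, y, w, keep=0.5):
    """Stand-in instance selector: keep the heaviest fraction of each class."""
    rows = []
    for label in set(y):
        members = sorted((i for i in range(len(y)) if y[i] == label),
                         key=lambda i: -w[i])
        rows += members[: max(1, int(keep * len(members)))]
    return sorted(rows)


def fit_stump(Xs, ys, ws):
    """Stand-in base learner (not an SVM): weighted stump on binary features."""
    best = None
    for j in range(len(Xs[0])):
        for s in (1, -1):
            err = sum(w for x, t, w in zip(Xs, ys, ws)
                      if (s if x[j] else -s) != t)
            if best is None or err < best[0]:
                best = (err, j, s)
    _, j, s = best
    return lambda x: s if x[j] else -s


# Toy data (made up for illustration): the first feature determines the label.
X = [[a, b, c] for a in (0, 1) for b in (0, 1) for c in (0, 1)]
y = [1 if row[0] else -1 for row in X]
ensemble = boost_with_selection(X, y, 3, heaviest_per_class, mi_forward, fit_stump)
predictions = [predict_ensemble(ensemble, row) for row in X]
```

Each round trains on half of the instances and two of the three features, so the stored classifiers are smaller than ones trained on the full data. Passing the selectors and the base learner as hooks keeps the boosting loop unchanged when they are swapped for the actual immune clonal selector and an SVM.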