A single SVM can suffer from degraded generalization ability, and when SVMs are combined with the one-against-rest strategy for multi-class classification, the resulting error may be unbounded. To address both problems, this paper uses the Bagging method to construct an SVM-based multi-class classification ensemble model and evaluates it in simulation experiments on the MIT KDD 99 dataset. The experiments examine how two parameters, the number of training samples and the number of base classifiers, affect ensemble learning performance, and compare the ensemble with single classifiers trained on all samples and on a subset of the samples. The results show that the ensemble learning algorithm effectively reduces the computational complexity incurred by training on all samples and improves detection accuracy, and that it also avoids the unstable, low-accuracy detection caused by learning from sampled data.
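The general idea of a Bagging ensemble of one-against-rest SVMs can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes a linear SVM trained by stochastic subgradient descent on the regularized hinge loss, synthetic two-dimensional data in place of KDD 99, and majority voting to combine the base classifiers. All function names and parameter values (`train_bagging`, `sample_size`, `n_estimators`, and so on) are hypothetical. The two arguments `sample_size` and `n_estimators` correspond to the two parameters studied in the experiments.

```python
import random
from collections import Counter

def train_binary_svm(X, y, lam=0.01, lr=0.01, epochs=50, rng=None):
    """Linear SVM via stochastic subgradient descent on the L2-regularized
    hinge loss. Labels must be +1 / -1. The last weight acts as the bias
    (it is regularized too, to keep the sketch short)."""
    rng = rng or random.Random(0)
    w = [0.0] * (len(X[0]) + 1)
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            xi = list(X[i]) + [1.0]
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, xi))
            w = [wj * (1.0 - lr * lam) for wj in w]  # regularization shrinkage
            if margin < 1.0:  # hinge loss is active for this sample
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, xi)]
    return w

def score(w, x):
    return sum(wj * xj for wj, xj in zip(w, list(x) + [1.0]))

def train_one_vs_rest(X, y, classes, rng):
    # One binary SVM per class: that class is +1, every other class is -1.
    return {c: train_binary_svm(X, [1 if label == c else -1 for label in y], rng=rng)
            for c in classes}

def predict_one_vs_rest(model, x):
    return max(model, key=lambda c: score(model[c], x))

def train_bagging(X, y, n_estimators, sample_size, seed=0):
    """Train n_estimators one-vs-rest SVMs, each on a bootstrap sample
    (drawn with replacement) of sample_size training points."""
    rng = random.Random(seed)
    classes = sorted(set(y))
    ensemble = []
    for _ in range(n_estimators):
        picks = [rng.randrange(len(X)) for _ in range(sample_size)]
        ensemble.append(train_one_vs_rest([X[i] for i in picks],
                                          [y[i] for i in picks], classes, rng))
    return ensemble

def predict_bagging(ensemble, x):
    # Majority vote over the base classifiers.
    votes = Counter(predict_one_vs_rest(m, x) for m in ensemble)
    return votes.most_common(1)[0][0]

def make_blobs(n_per_class, seed):
    """Synthetic stand-in data: three well-separated Gaussian clusters."""
    rng = random.Random(seed)
    centers = {0: (0.0, 0.0), 1: (6.0, 0.0), 2: (0.0, 6.0)}
    X, y = [], []
    for label, (cx, cy) in centers.items():
        for _ in range(n_per_class):
            X.append([rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)])
            y.append(label)
    return X, y

X_train, y_train = make_blobs(200, seed=1)
X_test, y_test = make_blobs(50, seed=2)
ensemble = train_bagging(X_train, y_train, n_estimators=11, sample_size=60)
predictions = [predict_bagging(ensemble, x) for x in X_test]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
print(f"ensemble accuracy on toy data: {accuracy:.3f}")
```

Each base classifier sees only `sample_size` points, so its training cost is far below that of a single classifier trained on all samples, while the vote across `n_estimators` classifiers is intended to damp the variance of any one sampled classifier. A kernel SVM could be substituted for the linear one without changing the ensemble logic.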