To address classification on imbalanced datasets, this paper presents an improved SVM-KNN classification algorithm and, building on it, proposes an ensemble learning model. The model uses limited sampling to partition the majority-class samples into subsets, recombines each majority-class subset with the minority-class samples, and trains the improved SVM-KNN on each combined subset to obtain several base classifiers, which are then combined. Experiments on UCI data show that the ensemble model performs well on imbalanced data classification.
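
The partition-and-combine structure described in the abstract can be illustrated with a minimal sketch. Several choices here are assumptions, not the paper's method: the abstract does not specify the limited-sampling rule, so the sketch shuffles the majority class and cuts it into disjoint chunks the size of the minority class; a plain k-NN stands in for the improved SVM-KNN base learner, whose details are not given; and the base classifiers are combined by unweighted majority vote. All function names and the toy data are hypothetical.

```python
import math
import random
from collections import Counter


def split_majority(majority, chunk_size, seed=0):
    """Shuffle the majority samples and cut them into disjoint chunks.

    Assumption: this stands in for the paper's limited sampling step.
    The last chunk may be smaller if the sizes do not divide evenly.
    """
    rng = random.Random(seed)
    shuffled = majority[:]
    rng.shuffle(shuffled)
    return [shuffled[i:i + chunk_size]
            for i in range(0, len(shuffled), chunk_size)]


def build_training_sets(majority, minority, seed=0):
    """Pair every majority chunk with the full minority class.

    Requires a non-empty minority class (its size sets the chunk size).
    """
    chunks = split_majority(majority, chunk_size=len(minority), seed=seed)
    return [chunk + minority for chunk in chunks]


def knn_predict(train, x, k=3):
    """Stand-in base learner: plain k-NN over (features, label) pairs.

    Assumption: the paper trains an improved SVM-KNN here instead.
    """
    nearest = sorted(train, key=lambda s: math.dist(s[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def ensemble_predict(training_sets, x, k=3):
    """Combine the base classifiers by unweighted majority vote."""
    votes = Counter(knn_predict(ts, x, k) for ts in training_sets)
    return votes.most_common(1)[0][0]


# Toy data: 12 majority samples (label 0) near the origin,
# 3 minority samples (label 1) near (5, 5).
majority = [(((i % 4) * 0.5, (i // 4) * 0.5), 0) for i in range(12)]
minority = [((5.0, 5.0), 1), ((5.5, 5.0), 1), ((5.0, 5.5), 1)]

training_sets = build_training_sets(majority, minority)
pred_minority = ensemble_predict(training_sets, (5.2, 5.1))
pred_majority = ensemble_predict(training_sets, (0.2, 0.3))
```

Each base learner sees a balanced training set (3 majority and 3 minority samples in this toy case), which is the point of the partitioning: no single classifier is dominated by the majority class, while the ensemble as a whole still uses every majority sample.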