To address the problems that support vector machines are highly sensitive to noise and outliers, and that on large-scale, heavily overlapping training sets they produce many support vectors, classify slowly, and achieve low accuracy, a KNN-SVM classifier is proposed based on the KNN method. First, the training set is pruned in the feature space according to how many of each sample's K nearest neighbors share its class label, and an SVM is then trained on the reduced set; it is further proved that, for the Gaussian or exponential kernel, this pruning can be carried out equivalently in the input space. The method relieves the trainer of the burden caused by noise, outliers, and samples that contribute little to the decision surface, and reduces the number of support vectors, so that, compared with SVM, training and testing are faster and classification accuracy is higher. Simulation experiments show that KNN-SVM has these advantages, prunes the training set more reasonably than NN-SVM, and achieves higher classification accuracy.
Since the support vector machine is very sensitive to noise and outliers, and yields many support vectors, slow classification, and low accuracy on large-scale, heavily overlapping training sets, a novel classifier, KNN-SVM, is proposed based on KNN. First, the training set is pruned in the feature space according to the ratio of same-class labels among the k nearest neighbors of each sample, and the reduced set is then trained with SVM. It is also proved that, for the Gaussian or exponential kernel, the pruning can equivalently be performed in the input space. This strategy reduces the training burden resulting from noise, outliers, and samples that have little effect on the classification surface, and decreases the number of support vectors. Compared with SVM, KNN-SVM trains and classifies faster and generalizes better. Numerical simulations show that it has the advantages mentioned above, prunes the training set more reasonably than NN-SVM, and achieves higher classification accuracy.
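To make the prune-then-train pipeline described above concrete, the following is a minimal Python sketch using scikit-learn. The helper name knn_prune, the choice of k, and the same-class ratio threshold are illustrative assumptions, not values from the paper; the exact pruning rule in the paper may differ. The neighbor search is done directly in the input space, which is consistent with the abstract's claim for the Gaussian (RBF) or exponential kernel, since feature-space distances are then monotone in input-space distances.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.svm import SVC

def knn_prune(X, y, k=10, min_same_ratio=0.5):
    """Keep a sample only if at least `min_same_ratio` of its k nearest
    neighbors (excluding itself) carry the same class label; samples
    surrounded mostly by the other class are discarded as noise/outliers.
    The threshold is a hypothetical choice for illustration."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)  # +1: each point is its own neighbor
    _, idx = nn.kneighbors(X)
    same_ratio = (y[idx[:, 1:]] == y[:, None]).mean(axis=1)
    keep = same_ratio >= min_same_ratio
    return X[keep], y[keep]

# Usage: prune the training set, then train a standard RBF-kernel SVM on it.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(2, 1, (200, 2))])
y = np.hstack([np.zeros(200), np.ones(200)])

X_pruned, y_pruned = knn_prune(X, y, k=10, min_same_ratio=0.5)
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_pruned, y_pruned)
print(f"kept {len(y_pruned)} of {len(y)} samples, "
      f"{clf.support_vectors_.shape[0]} support vectors")
```

Because interior points and noisy points removed by the pruning step rarely become support vectors anyway, the reduced set typically yields fewer support vectors and a faster SVM, which is the effect the abstract reports.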