This paper proposes a novel SVM-KNN classification method based on semi-supervised learning. It addresses the problem of using a large number of unlabeled examples to boost classifier performance when only a small set of labeled examples is available, a setting in which an SVM trained on the labeled data alone cannot obtain an accurate classification boundary. The few labeled examples are first used to train a weak SVM classifier. Boundary vectors are then extracted from the unlabeled examples and, with the help of a KNN classifier, used to improve the weak SVM iteratively. Introducing the KNN classifier not only enlarges the SVM training set but also improves the labeling quality of the new training examples derived from the boundary vectors, so that the SVM classification boundary is repaired step by step. Experiments on UCI data sets show that the proposed method can improve the classification accuracy of the final SVM, that tuning its parameters yields better classification results, and that it reduces the cost of labeling large numbers of unlabeled examples.
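The iterative procedure summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how boundary vectors are selected, how many are added per iteration, or when the loop stops, so the choices below are assumptions. Boundary vectors are taken to be the unlabeled points with the smallest absolute SVM decision value, and the loop runs for a fixed number of rounds. To keep the example dependency-free, the SVM is replaced by a simplified stand-in (a linear soft-margin SVM fitted by subgradient descent on the hinge loss). All function names and parameters (`train_linear_svm`, `knn_label`, `svm_knn_self_training`, `k`, `rounds`, `per_round`) and the synthetic two-cluster data are hypothetical.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Stand-in SVM (assumption): linear soft-margin SVM fitted by
    full-batch subgradient descent on the regularized hinge loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]       # gradient of the L2 regularizer
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1:     # hinge loss is active for this sample
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def decision(model, x):
    w, b = model
    return dot(w, x) + b

def knn_label(X, y, x, k):
    """Majority vote (labels in {-1, +1}, k odd) among the k nearest labeled points."""
    by_distance = sorted(range(len(X)),
                         key=lambda i: sum((a - c) ** 2 for a, c in zip(X[i], x)))
    return 1 if sum(y[i] for i in by_distance[:k]) > 0 else -1

def svm_knn_self_training(X_lab, y_lab, X_unl, k=3, rounds=5, per_round=4):
    X_train, y_train, pool = list(X_lab), list(y_lab), list(X_unl)
    for _ in range(rounds):
        if not pool:
            break
        model = train_linear_svm(X_train, y_train)
        # Assumed selection rule: boundary vectors are the unlabeled points
        # closest to the current decision boundary.
        pool.sort(key=lambda x: abs(decision(model, x)))
        boundary, pool = pool[:per_round], pool[per_round:]
        # KNN over the current training set assigns labels to the boundary vectors.
        new_labels = [knn_label(X_train, y_train, x, k) for x in boundary]
        X_train += boundary
        y_train += new_labels
    return train_linear_svm(X_train, y_train), X_train, y_train

# Synthetic data (assumption): two classes drawn uniformly from 2x2 boxes
# centred on (3, 3) and (-3, -3).
rng = random.Random(0)

def sample(label, n):
    return [[3 * label + rng.uniform(-1, 1), 3 * label + rng.uniform(-1, 1)]
            for _ in range(n)]

X_lab, y_lab = sample(1, 3) + sample(-1, 3), [1] * 3 + [-1] * 3
X_unl = sample(1, 20) + sample(-1, 20)
X_test, y_test = sample(1, 50) + sample(-1, 50), [1] * 50 + [-1] * 50

model, X_train, y_train = svm_knn_self_training(X_lab, y_lab, X_unl)
accuracy = sum((1 if decision(model, x) > 0 else -1) == t
               for x, t in zip(X_test, y_test)) / len(y_test)
```

Each round retrains the SVM on a training set that has grown by `per_round` KNN-labeled boundary vectors, which mirrors the roles the abstract gives the two classifiers: the SVM locates the uncertain region, and KNN supplies labels for the points in it. On real data a kernel SVM and a tuned `k` would normally replace the simplified components used here.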