A semi-supervised support vector machine (SVM) classification algorithm based on two-stage learning is proposed. In the first stage, a graph-based label propagation algorithm assigns initial pseudo-labels to the unlabeled samples, and a k-nearest-neighbor graph is used to identify and remove likely noisy samples. In the second stage, the denoised sample set is treated as labeled data and fed to the SVM, so that training takes the information of the whole sample set into account and classification accuracy improves. Experimental results show that, compared with other semi-supervised learning algorithms, the proposed algorithm achieves better classification performance and higher reliability when few labeled training samples are available.
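The two stages described above can be sketched in a few lines. This is only an illustrative sketch, not the authors' implementation: it assumes scikit-learn, and the function name `two_stage_s3vm`, the parameters `k` and `min_agreement`, the neighbor-agreement rule used for denoising, the choice to always keep the originally labeled samples, and the RBF kernel are all assumptions that the abstract does not specify.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.neighbors import NearestNeighbors
from sklearn.semi_supervised import LabelPropagation
from sklearn.svm import SVC


def two_stage_s3vm(X, y, k=7, min_agreement=0.5):
    """Hypothetical two-stage pipeline; y uses -1 for unlabeled samples.

    Returns the fitted SVC and a boolean mask of the samples kept for training.
    """
    # Stage 1a: graph-based label propagation gives every sample a pseudo-label.
    propagator = LabelPropagation(kernel="knn", n_neighbors=k, max_iter=1000)
    propagator.fit(X, y)
    # Keep the true label wherever one was given.
    pseudo = np.where(y != -1, y, propagator.transduction_)

    # Stage 1b: k-NN graph denoising (assumed rule): drop a pseudo-labeled
    # sample when too few of its k nearest neighbors share its label.
    _, idx = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)
    neighbor_labels = pseudo[idx[:, 1:]]  # first column is normally the point itself
    agreement = (neighbor_labels == pseudo[:, None]).mean(axis=1)
    keep = (agreement >= min_agreement) | (y != -1)  # never drop true labels

    # Stage 2: train an ordinary SVM on the denoised set as if fully labeled.
    svm = SVC(kernel="rbf", gamma="scale").fit(X[keep], pseudo[keep])
    return svm, keep


# Toy usage: 300 points, only 5 labeled samples per class.
X, y_true = make_moons(n_samples=300, noise=0.1, random_state=0)
labeled = np.concatenate([np.flatnonzero(y_true == c)[:5] for c in (0, 1)])
y = np.full_like(y_true, -1)
y[labeled] = y_true[labeled]

svm, keep = two_stage_s3vm(X, y)
print("kept samples:", int(keep.sum()), "of", len(X))
print("accuracy on all samples:", svm.score(X, y_true))
```

Because the SVM is trained on pseudo-labels that cover most of the sample set, its decision boundary can reflect the structure of the unlabeled data, which is the motivation the abstract gives for the second stage. The agreement threshold controls how aggressively suspected noise is removed and would need tuning on real data.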