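An Ensemble one-class semi-supervised learning algorithm is proposed that combines the advantages of one-class learners and ensemble learning. The algorithm first builds a one-class classifier for each of the two classes in a small set of labeled data. The two one-class classifiers then jointly identify the unlabeled samples, and the identified unlabeled samples are used to adjust and optimize the two classification surfaces. Finally, the identified unlabeled data and the labeled data are combined to train a base classifier, and multiple base classifiers are assembled into an ensemble that votes on each test sample. Experiments on five UCI datasets show that the algorithm improves average recognition accuracy by 4.5% over the tri-training algorithm and by 8.9% over one-class classifiers trained only on the labeled data. The experimental results indicate that the algorithm is effective for semi-supervised problems.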
A semi-supervised learning algorithm based on one-class classification is proposed. First, a one-class classifier is built for each class of data in the labeled dataset. These one-class classifiers are then used to classify some of the unlabeled data, and the classification results are used to adjust and optimize the two classification surfaces. All labeled data and the unlabeled data recognized in this step are used to train a base classifier. The label of a test sample is determined from the classification results of the base classifiers. Experimental results on UCI datasets show that the average detection precision of the proposed algorithm is 4.5% higher than that of the tri-training algorithm and 8.9% higher than that of the classifier trained on labeled data alone.
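The pipeline described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not specify several details, so the following are assumptions made only for illustration: the one-class model is a simple hypersphere around the class mean (`SphereOneClass`), an unlabeled sample is accepted only when exactly one of the two one-class models claims it, each base classifier is a nearest-centroid rule, and diversity between base classifiers comes from giving each one a random subset of the unlabeled data. All function and parameter names are hypothetical.

```python
import math
import random
from collections import Counter


def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    dim = len(points[0])
    return tuple(sum(p[i] for p in points) / len(points) for i in range(dim))


class SphereOneClass:
    """Toy one-class model (assumption): a sphere around the class mean."""

    def __init__(self, points, quantile=0.9):
        self.center = centroid(points)
        dists = sorted(math.dist(p, self.center) for p in points)
        # Radius is an upper quantile of the training distances.
        self.radius = dists[min(len(dists) - 1, int(quantile * len(dists)))]

    def accepts(self, x):
        return math.dist(x, self.center) <= self.radius


def self_label(labeled, unlabeled, rounds=5):
    """Grow each class pool with unlabeled points claimed by exactly one model.

    The one-class models are refitted every round, which stands in for
    "adjusting the classification surfaces" in the abstract.
    """
    pools = {c: list(pts) for c, pts in labeled.items()}
    remaining = list(unlabeled)
    for _ in range(rounds):
        models = {c: SphereOneClass(pts) for c, pts in pools.items()}
        still_unknown = []
        for x in remaining:
            claims = [c for c, m in models.items() if m.accepts(x)]
            if len(claims) == 1:
                pools[claims[0]].append(x)
            else:
                still_unknown.append(x)
        if len(still_unknown) == len(remaining):
            break  # no new sample was recognized in this round
        remaining = still_unknown
    return pools


def train_base(labeled, unlabeled, rng, fraction=0.7):
    """One base classifier: class centroids of labeled + recognized data."""
    k = max(1, int(fraction * len(unlabeled)))
    subset = rng.sample(unlabeled, k)  # assumption: source of diversity
    pools = self_label(labeled, subset)
    return {c: centroid(pts) for c, pts in pools.items()}


def build_ensemble(labeled, unlabeled, n_base=5, seed=0):
    rng = random.Random(seed)
    return [train_base(labeled, unlabeled, rng) for _ in range(n_base)]


def predict(ensemble, x):
    """Majority vote over the nearest-centroid decisions of the base classifiers."""
    votes = Counter(
        min(cents, key=lambda c: math.dist(x, cents[c])) for cents in ensemble
    )
    return votes.most_common(1)[0][0]


def make_demo_data(seed=0):
    """Synthetic two-class data (not a UCI dataset): two well-separated blobs."""
    rng = random.Random(seed)

    def blob(cx, cy, n):
        return [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)) for _ in range(n)]

    labeled = {0: blob(0, 0, 5), 1: blob(5, 5, 5)}
    unlabeled = blob(0, 0, 40) + blob(5, 5, 40)
    return labeled, unlabeled


labeled, unlabeled = make_demo_data()
ensemble = build_ensemble(labeled, unlabeled)
print(predict(ensemble, (0.2, -0.1)), predict(ensemble, (4.8, 5.3)))
```

The synthetic blobs only exercise the control flow. They say nothing about the accuracy figures reported on the UCI datasets, and a real experiment would substitute a proper one-class learner and base classifier.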