Semi-supervised learning and active learning are two important research areas in machine learning. Semi-supervised learning trains a classifier on labeled samples and uses it to label unlabeled samples, thereby increasing the number of labeled samples. However, if unlabeled samples are labeled incorrectly, the errors carry over into the subsequent iterative training of the classifier and reduce the prediction accuracy of the final classifier. This paper therefore introduces the idea of active learning into semi-supervised learning. The MPWPS algorithm first selects the samples that are most likely to be predicted incorrectly and passes them to an expert for labeling; these samples are then combined with the already labeled samples for iterative co-training, which improves both the performance of the classifier and the accuracy of the labeling. This paper implements a semi-supervised co-training classification algorithm based on MPWPS active learning, and experiments on UCI data sets confirm the effectiveness of the algorithm.
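The workflow described in the abstract (select risky samples, have an expert label them, then co-train on the enlarged labeled set) can be illustrated with a minimal sketch. The abstract does not specify how MPWPS scores samples, so the `risk_score` function below is a stand-in assumption (disagreement between two views, then smallest margin), not the MPWPS criterion. The nearest-centroid classifiers, the synthetic two-feature data, the `oracle` callback that plays the expert, and all parameter names are likewise hypothetical choices made for illustration.

```python
import random

def fit_centroids(values, labels):
    """Mean of a 1-D feature per class (a tiny stand-in classifier)."""
    means = {}
    for c in (0, 1):
        vals = [v for v, y in zip(values, labels) if y == c]
        means[c] = sum(vals) / len(vals)
    return means

def predict(means, v):
    """Return (predicted class, margin); a larger margin is more confident."""
    d0, d1 = abs(v - means[0]), abs(v - means[1])
    return (0 if d0 <= d1 else 1), abs(d0 - d1)

def train_views(X, labeled):
    """Train one classifier per view (feature 0 and feature 1)."""
    idx = list(labeled)
    ys = [labeled[i] for i in idx]
    return [fit_centroids([X[i][view] for i in idx], ys) for view in (0, 1)]

def risk_score(models, x):
    """Assumed stand-in for the selection criterion, NOT MPWPS itself.
    Lower score means the prediction is more likely to be wrong."""
    (pa, ma), (pb, mb) = predict(models[0], x[0]), predict(models[1], x[1])
    if pa != pb:
        return -1.0, None      # the two views disagree: riskiest case
    return min(ma, mb), pa     # the views agree: weakest margin decides

def active_co_training(X, oracle, labeled, rounds=3, n_query=2, n_pseudo=4):
    labeled = dict(labeled)
    queried = set()
    pool = [i for i in range(len(X)) if i not in labeled]
    for _ in range(rounds):
        if not pool:
            break
        models = train_views(X, labeled)
        ranked = sorted(pool, key=lambda i: risk_score(models, X[i])[0])
        # Active-learning step: the riskiest samples go to the expert.
        for i in ranked[:n_query]:
            labeled[i] = oracle(i)
            queried.add(i)
        # Co-training step: the most confident agreed samples are pseudo-labeled.
        rest = ranked[n_query:]
        agreed = [i for i in reversed(rest)
                  if risk_score(models, X[i])[1] is not None]
        for i in agreed[:n_pseudo]:
            labeled[i] = risk_score(models, X[i])[1]
        pool = [i for i in pool if i not in labeled]
    return labeled, queried

# Synthetic two-class data; each feature is treated as a separate view.
random.seed(0)
y_true = [i % 2 for i in range(60)]
X = [(random.gauss(3.0 * y, 1.0), random.gauss(3.0 * y, 1.0)) for y in y_true]
initial = {i: y_true[i] for i in range(4)}   # two labeled samples per class
labeled, queried = active_co_training(X, lambda i: y_true[i], initial)
```

In this sketch the expert-labeled samples are always correct by construction, while pseudo-labels may still be wrong; sending the riskiest samples to the expert is meant to keep such errors out of later training rounds.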