A semi-supervised classification model was proposed to overcome the problems caused by mislabeled samples. A decision rule was learned from both labeled and unlabeled data, and a new energy function based on a robust error function was used within a Markov random field. Two algorithms, one based on iterated conditional modes and the other on Markov chain Monte Carlo, were designed to infer the labels of both labeled and unlabeled samples. Experimental results demonstrated that both methods were efficient and robust on real-world data.
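To make the inference step concrete, the following is a minimal sketch of iterated conditional modes on a binary-label Markov random field. The abstract does not give the energy function, so everything specific here is an assumption made for illustration: the function name `icm`, the parameters `data_cost` and `smooth`, the bounded penalty used as a stand-in for the robust error term, the weighted disagreement (Potts-style) smoothness term, and the toy graph. It should not be read as the authors' formulation.

```python
from itertools import combinations


def icm(edges, observed, n, data_cost=1.0, smooth=1.0, max_sweeps=50):
    """Iterated conditional modes for a binary-label MRF (illustrative sketch).

    edges:    list of (i, j, w) undirected weighted edges between samples.
    observed: dict mapping a sample index to its given, possibly wrong, label.
    n:        total number of samples (labeled and unlabeled).
    """
    nbrs = [[] for _ in range(n)]
    for i, j, w in edges:
        nbrs[i].append((j, w))
        nbrs[j].append((i, w))

    # Start from the given labels; unlabeled samples start at 0.
    y = [observed.get(i, 0) for i in range(n)]

    def local_energy(i, label):
        e = 0.0
        # Assumed robust data term: a bounded penalty for contradicting the
        # given label, so a suspicious label can be overruled by its neighbors.
        if i in observed and observed[i] != label:
            e += data_cost
        # Assumed smoothness term: weighted count of disagreeing neighbors.
        e += smooth * sum(w for j, w in nbrs[i] if y[j] != label)
        return e

    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            # Pick the label with the lowest local energy; on a tie, keep the
            # current label so the total energy never increases.
            best = min((0, 1), key=lambda c: (local_energy(i, c), c != y[i]))
            if best != y[i]:
                y[i] = best
                changed = True
        if not changed:
            break
    return y


# Toy data: two fully connected clusters joined by one weak edge.
edges = [(i, j, 1.0) for i, j in combinations(range(4), 2)]
edges += [(i, j, 1.0) for i, j in combinations(range(4, 8), 2)]
edges.append((3, 4, 0.1))
# Sample 2 belongs to the first cluster but carries the wrong label 1.
observed = {0: 0, 1: 0, 2: 1, 5: 1, 6: 1}
labels = icm(edges, observed, n=8)
print(labels)
```

In this toy run the given label of sample 2 is overruled by its three neighbors, and the unlabeled samples 3, 4 and 7 take the label of their own cluster. A Markov chain Monte Carlo variant would replace the greedy minimum with sampling from the local conditional distribution; that variant is not sketched here.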