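This paper analyzes the effect of noise on the Gaussian-Laplacian regularized (GLR) framework for semi-supervised learning. Because the least-squares criterion is sensitive to noise, GLR is combined with the maximum correntropy criterion (MCC) from information theory to obtain a robust semi-supervised learning algorithm based on MCC, called GLR-MCC, and the convergence of the algorithm is proved. Half-quadratic optimization is used to solve the correntropy objective function: in each iteration, the complex information-theoretic optimization problem reduces to a standard semi-supervised learning problem. Simulation experiments on typical machine learning data sets show that the algorithm effectively improves the performance of semi-supervised learning under label noise and occlusion noise.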
This paper analyzes the sensitivity to noise of the mean square criterion in the Gaussian-Laplacian regularized (GLR) algorithm. A robust semi-supervised learning algorithm based on the maximum correntropy criterion (MCC), called GLR-MCC, is proposed to improve the robustness of GLR, and its convergence is analyzed. The half-quadratic optimization technique is used to reduce the correntropy optimization problem to a standard semi-supervised problem in each iteration. Experimental results on typical machine learning data sets show that the proposed GLR-MCC is more robust to mislabeling noise and occlusion than related semi-supervised learning algorithms.
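The half-quadratic step described above can be illustrated with a minimal sketch. The abstract does not state the exact objective, so the code assumes a common form: maximize the sum over labeled points of exp(-(f_i - y_i)^2 / (2 sigma^2)) minus a graph-Laplacian smoothness penalty. Under that assumption, fixing the per-sample correntropy weights turns each iteration into a weighted least-squares semi-supervised problem with a closed-form linear solve. The function names, the dense Gaussian-kernel graph, and the parameters `sigma`, `lam`, and `bandwidth` are all hypothetical choices for illustration, not the paper's implementation.

```python
import numpy as np


def gaussian_graph_laplacian(X, bandwidth=1.0):
    """Unnormalized Laplacian L = D - A of a dense Gaussian-kernel graph (assumed graph model)."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    A = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    np.fill_diagonal(A, 0.0)  # no self-loops
    return np.diag(A.sum(axis=1)) - A


def glr_mcc_sketch(L, y, labeled, sigma=0.5, lam=1.0, n_iter=50, tol=1e-8):
    """Half-quadratic iteration for an assumed correntropy objective.

    L       : (n, n) graph Laplacian
    y       : (n,) targets; entries at unlabeled positions are ignored
    labeled : (n,) boolean mask of labeled points
    Returns the score vector f and the final per-sample weights w.
    """
    n = L.shape[0]
    f = np.zeros(n)
    w = np.zeros(n)
    w[labeled] = 1.0  # uniform weights: the first pass is plain least-squares regularization
    for _ in range(n_iter):
        # With w fixed, the problem is a standard weighted semi-supervised solve:
        # (diag(w) + lam * L) f = diag(w) y
        f_new = np.linalg.solve(np.diag(w) + lam * L, w * y)
        # Auxiliary-variable update: large residuals receive small weights.
        residual = f_new[labeled] - y[labeled]
        w[labeled] = np.exp(-residual ** 2 / (2.0 * sigma ** 2))
        converged = np.max(np.abs(f_new - f)) < tol
        f = f_new
        if converged:
            break
    return f, w


if __name__ == "__main__":
    # Two well-separated 1-D clusters; one label in the first cluster is flipped.
    X = np.concatenate([np.arange(10) * 0.1, 5.0 + np.arange(10) * 0.1])[:, None]
    y = np.zeros(20)
    labeled = np.zeros(20, dtype=bool)
    labeled[:5] = True
    labeled[10:15] = True
    y[:5] = 1.0
    y[10:15] = -1.0
    y[4] = -1.0  # simulated label noise
    f, w = glr_mcc_sketch(gaussian_graph_laplacian(X), y, labeled)
    print("weight of flipped label:", w[4])
    print("predicted signs:", np.sign(f))
```

In this toy setup the flipped label produces a large residual after the first least-squares pass, so its weight shrinks in later iterations and it stops influencing the solution. The kernel width `sigma` controls how aggressively large residuals are discounted.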