Constructive Machine Learning (CML) algorithms need a large number of labeled examples to train a classifier, but labeled examples are difficult to obtain. To address this, this paper proposes a constructive learning method based on the Tri-training algorithm. From the labeled examples, it uses different strategies to construct three initial covering classification networks that differ substantially from one another. These networks are used to label the unlabeled examples; the newly labeled examples are then added to the training set, and the parameters of each classification network are adjusted. This process is repeated until a stable classifier is obtained. Experimental results show that the method achieves higher classification accuracy than the CML algorithm and a semi-supervised learning algorithm based on the Naive Bayes (NB) classifier.
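The iterative labeling loop described in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the paper: the base learner is a simple nearest-centroid classifier standing in for the covering classification network, the three "different strategies" are modeled as three distance metrics, and the error-rate checks of the original Tri-training algorithm are omitted. All names (`CentroidClassifier`, `tri_train`, `vote`) are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Point = Tuple[float, float]
Metric = Callable[[Point, Point], float]

def euclidean(a: Point, b: Point) -> float:
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

def manhattan(a: Point, b: Point) -> float:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def chebyshev(a: Point, b: Point) -> float:
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

class CentroidClassifier:
    """Stand-in base learner (assumption): nearest class centroid under a metric."""

    def __init__(self, metric: Metric) -> None:
        self.metric = metric
        self.centroids: Dict[str, Point] = {}

    def fit(self, xs: Sequence[Point], ys: Sequence[str]) -> "CentroidClassifier":
        groups: Dict[str, List[Point]] = {}
        for x, y in zip(xs, ys):
            groups.setdefault(y, []).append(x)
        self.centroids = {
            y: (sum(p[0] for p in pts) / len(pts), sum(p[1] for p in pts) / len(pts))
            for y, pts in groups.items()
        }
        return self

    def predict(self, x: Point) -> str:
        return min(self.centroids, key=lambda y: self.metric(x, self.centroids[y]))

def tri_train(labeled_x: Sequence[Point], labeled_y: Sequence[str],
              unlabeled: Sequence[Point], max_rounds: int = 10) -> List[CentroidClassifier]:
    """Simplified Tri-training-style loop (no error-rate conditions)."""
    metrics = [euclidean, manhattan, chebyshev]
    # Step 1: build three differing initial classifiers from the labeled data.
    clfs = [CentroidClassifier(m).fit(labeled_x, labeled_y) for m in metrics]
    prev: Optional[List[Dict[int, str]]] = None
    for _ in range(max_rounds):
        # Step 2: for classifier i, pseudo-label the examples on which
        # the other two classifiers agree.
        pseudo: List[Dict[int, str]] = []
        for i in range(3):
            j, k = [t for t in range(3) if t != i]
            agreed: Dict[int, str] = {}
            for idx, x in enumerate(unlabeled):
                label = clfs[j].predict(x)
                if label == clfs[k].predict(x):
                    agreed[idx] = label
            pseudo.append(agreed)
        # Step 4: stop once the pseudo-labels no longer change.
        if pseudo == prev:
            break
        # Step 3: retrain each classifier on labeled plus pseudo-labeled data.
        for i in range(3):
            xs = list(labeled_x) + [unlabeled[idx] for idx in pseudo[i]]
            ys = list(labeled_y) + list(pseudo[i].values())
            clfs[i] = CentroidClassifier(metrics[i]).fit(xs, ys)
        prev = pseudo
    return clfs

def vote(clfs: Sequence[CentroidClassifier], x: Point) -> str:
    """Majority vote; on a three-way tie the first classifier's label wins."""
    preds = [c.predict(x) for c in clfs]
    return max(preds, key=preds.count)

# Toy usage with two well-separated classes.
demo_clfs = tri_train(
    [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (6.0, 5.0)], ["a", "a", "b", "b"],
    [(0.5, 1.0), (1.0, 1.0), (5.0, 6.0), (6.0, 6.0), (0.0, 1.0), (5.5, 4.0)],
)
print(vote(demo_clfs, (0.2, 0.3)), vote(demo_clfs, (5.8, 5.2)))
```

In this sketch the stopping rule is "pseudo-labels unchanged between rounds", which is only one possible reading of "until a stable classifier is obtained"; the paper's actual criterion and its network construction strategies are not specified in the abstract.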