A semi-supervised classification algorithm for hyperspectral remote sensing images, named DE-self-training, is proposed. A small number of labeled samples serve as the initial training set, and an initial classifier is built with an improved Self-training algorithm to predict the labels of the unlabeled samples. A certain proportion of the classified samples is then selected at random and added, together with their predicted labels, to the training set; the classifier is retrained on the enlarged training set and used to predict the remaining unlabeled samples. This train-predict-select cycle is repeated iteratively. During the iterations, a data editing strategy based on the nearest neighbor rule filters out mislabeled samples introduced when the training set is enlarged, which preserves the quality of the training set and allows progressively more accurate classifiers to be trained until every unlabeled sample has received a class label. The algorithm was tested on the AVIRIS Indian Pines and Hyperion EO-1 Botswana data sets and compared with support vector machine (SVM) classification. The experiments show that, with a limited number of labeled samples, DE-self-training exploits the useful information in the unlabeled samples and improves both the overall classification accuracy and the Kappa coefficient to varying degrees.
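The iterative procedure summarized in the abstract (train, predict, randomly select a proportion of the predictions, filter them with a nearest-neighbor editing rule, enlarge the training set) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not name the base classifier or the exact editing rule, so the sketch assumes a nearest-centroid classifier as a stand-in and a majority vote among the k nearest training samples as the editing check. The function names and the parameters `ratio`, `k`, `max_iter`, and `seed` are hypothetical, and returning rejected samples to the unlabeled pool is likewise an assumption.

```python
import math
import random
from collections import Counter


def nearest_centroid_fit(X, y):
    """Stand-in base classifier (assumption): one mean vector per class."""
    sums, counts = {}, Counter(y)
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}


def nearest_centroid_predict(model, x):
    """Return the class whose centroid is closest to x."""
    return min(model, key=lambda c: math.dist(model[c], x))


def knn_agrees(x, label, X_train, y_train, k=3):
    """Editing check (assumed form): accept (x, label) only if the majority
    of the k nearest training samples carries the same label."""
    order = sorted(range(len(X_train)), key=lambda i: math.dist(X_train[i], x))
    votes = Counter(y_train[i] for i in order[:k])
    return votes.most_common(1)[0][0] == label


def de_self_training(X_lab, y_lab, X_unlab, ratio=0.5, k=3, max_iter=20, seed=0):
    """Self-training loop with nearest-neighbor data editing (sketch)."""
    rng = random.Random(seed)
    X_train, y_train = list(X_lab), list(y_lab)
    pool = list(X_unlab)
    for _ in range(max_iter):
        if not pool:
            break
        # Train on the current training set and predict the whole pool.
        model = nearest_centroid_fit(X_train, y_train)
        preds = [nearest_centroid_predict(model, x) for x in pool]
        # Randomly pick a proportion of the pool as candidates.
        n_pick = max(1, int(ratio * len(pool)))
        picked = set(rng.sample(range(len(pool)), n_pick))
        new_X, new_y, remaining = [], [], []
        for i, x in enumerate(pool):
            # Candidates that fail the editing check go back to the pool.
            if i in picked and knn_agrees(x, preds[i], X_train, y_train, k):
                new_X.append(x)
                new_y.append(preds[i])
            else:
                remaining.append(x)
        X_train.extend(new_X)
        y_train.extend(new_y)
        pool = remaining
    # Final classifier trained on the enlarged training set.
    model = nearest_centroid_fit(X_train, y_train)
    return model, X_train, y_train
```

Calling `de_self_training(X_lab, y_lab, X_unlab)` with lists of equal-length feature vectors returns the final model together with the enlarged training set. The `max_iter` cap is there because samples that repeatedly fail the editing check would otherwise keep the pool from emptying; any samples still in the pool at that point can be labeled with the final model.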