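Transfer learning solves small-sample problems in a target domain by fully exploiting knowledge shared with source domains; however, measuring the distribution discrepancy between training and test samples remains a major challenge in the field. This paper studies the "negative transfer" problem that arises in multi-source transfer learning algorithms from improper selection of source domains and of the auxiliary samples within them, and proposes a covariate-shift-corrected multi-source ensemble method under a transferability measure criterion. First, following the covariate shift principle between the source and target domains, a transferability measure for auxiliary samples is defined using joint probability density estimation, which verifies the consistency of the label distributions of the target and source domains in the data space. Second, a non-transfer decision procedure is introduced at the multi-source domain selection stage, improving the accuracy of knowledge transfer from the source domains. Finally, on the Caltech 256 dataset, the effectiveness of knowledge representation and transfer with Gist features is verified, and the effectiveness of auxiliary sample selection and source domain selection under various conditions is analyzed. Experimental results show that the proposed algorithm effectively reduces the occurrence of "negative transfer" and achieves better transfer learning performance.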
Transfer learning addresses small training sets in a target domain by sharing knowledge learned from source domains, where one main challenge is measuring the divergence between the distributions of training and test samples. To deal with the "negative transfer" problem caused by improper selection of auxiliary samples in source domains, this paper presents a covariate-shift-corrected multi-source ensemble method with a transferability criterion. First, a transferability metric for auxiliary samples is defined by joint density estimation, in accordance with the covariate shift principle between source and target, so that the consistency of the data distributions can be verified. Next, the transferability metric is evaluated for each source to decide whether transfer from that source should take place at all, which improves accuracy. Finally, experiments on Caltech256 with GIST features demonstrate the effectiveness and efficiency of the proposed approach, and its performance under diverse selections of auxiliary samples and source domains is discussed. Experimental results show that the proposed method effectively suppresses "negative transfer" and achieves better transfer learning performance.
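The general idea of weighting auxiliary samples by a density ratio and skipping poorly matched sources can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes 1-D features, uses a plain Gaussian kernel density estimate of the feature density rather than the joint density estimation described above, and takes the mean weight as a hypothetical transferability score. The function names, the bandwidth, and the threshold are all illustrative assumptions.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function estimating the density of 1-D samples with a Gaussian kernel."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
    return density

def importance_weights(source, target, bandwidth=0.5, eps=1e-12):
    """Covariate-shift weights w(x) = p_target(x) / p_source(x) for each source sample."""
    p_src = gaussian_kde(source, bandwidth)
    p_tgt = gaussian_kde(target, bandwidth)
    return [p_tgt(x) / (p_src(x) + eps) for x in source]

def select_sources(sources, target, threshold=0.5):
    """Keep sources whose mean weight (a hypothetical transferability score) reaches threshold."""
    kept = {}
    for name, samples in sources.items():
        weights = importance_weights(samples, target)
        score = sum(weights) / len(weights)
        if score >= threshold:  # below the threshold: decide not to transfer from this source
            kept[name] = weights
    return kept

# Toy data (made up for illustration): one source overlaps the target, one does not.
target = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
sources = {
    "near": [0.1, 0.3, 0.5, 0.7, 0.9],
    "far": [5.0, 5.2, 5.4, 5.6, 5.8],
}
kept = select_sources(sources, target)
print(sorted(kept))  # prints ['near']
```

In this toy setting the "far" source receives weights close to zero and is excluded, which mirrors the intent of a non-transfer decision; the retained weights could then be passed to a weighted learner for each kept source before ensembling.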