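When defining the distribution distance between domains, existing domain adaptation methods usually consider only the overall distribution of the domain samples and do not consider the distributions of the class-labeled samples separately. This limits them in applications with imbalanced datasets. To address this, taking full account of the class information of the source-domain samples and building on the structural risk minimization model, we propose a domain adaptation support vector machine based on class distribution (CDASVM) and extend it to a multi-source domain adaptation support vector machine (CDASVM from multiple sources, MSCDASVM) that can handle multi-source problems. Experimental results on artificial and real imbalanced datasets show that the proposed methods achieve better or comparable pattern classification performance.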
Current domain adaptation methods mostly consider only the overall sample distribution of each domain and ignore the samples' label information when measuring the distribution discrepancy between the source domain and the target domain. As a result, these methods may not work well on imbalanced datasets. In this paper, we define a distribution discrepancy that takes into account the label information of the source-domain samples and, based on the structural risk minimization principle, propose a novel domain adaptation learning method called the support vector machine for domain adaptation based on class distribution (CDASVM). CDASVM is then extended to MSCDASVM (CDASVM from multiple sources), which handles domain adaptation from multiple sources. Experimental results on artificial and real imbalanced datasets show that the proposed CDASVM and MSCDASVM outperform or are comparable to related domain adaptation methods.
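To illustrate why a discrepancy computed on the overall sample distribution can mislead on imbalanced data, the following minimal Python sketch compares an overall mean discrepancy with a per-class one. It is an illustration under stated assumptions and not the discrepancy defined in the paper: the abstract does not give the formula, so the sketch assumes a squared distance between sample means (a linear-kernel mean discrepancy), and the function names `overall_discrepancy` and `class_wise_discrepancy` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence

Vector = Sequence[float]


def mean_vector(samples: Sequence[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def overall_discrepancy(source: Sequence[Vector], target: Sequence[Vector]) -> float:
    """ASSUMPTION: discrepancy = squared distance between the two domain means."""
    return squared_distance(mean_vector(source), mean_vector(target))


def class_wise_discrepancy(
    source: Sequence[Vector],
    labels: Sequence[Hashable],
    target: Sequence[Vector],
) -> Dict[Hashable, float]:
    """ASSUMPTION: one discrepancy per source class, each measured
    between that class's mean and the (unlabeled) target mean."""
    by_class: Dict[Hashable, List[Vector]] = defaultdict(list)
    for x, y in zip(source, labels):
        by_class[y].append(x)
    target_mean = mean_vector(target)
    return {
        label: squared_distance(mean_vector(xs), target_mean)
        for label, xs in by_class.items()
    }


# Toy imbalanced source domain: nine samples of class +1, one of class -1.
source = [[0.0]] * 9 + [[10.0]]
labels = [+1] * 9 + [-1]
# Toy unlabeled target domain whose mean coincides with the overall source mean.
target = [[1.0]] * 10

print(overall_discrepancy(source, target))             # 0.0
print(class_wise_discrepancy(source, labels, target))  # {1: 1.0, -1: 81.0}
```

In this toy setting the overall discrepancy is zero even though the minority class lies far from the target mean, which is the kind of limitation on imbalanced datasets that a class-aware discrepancy is meant to expose.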