Support vector data description (SVDD) is often trained on datasets that contain both normal samples and outliers, all of which are labeled as the target class. To reduce the adverse influence of these outliers on the trained SVDD model, this paper proposes an SVDD outlier detection algorithm based on one-cluster kernel possibilistic C-means (PCM). One-cluster kernel PCM clustering yields the membership degree of each training sample in the normal class, and this membership degree is used as the sample's confidence level of belonging to the target class. The confidence levels are then incorporated into the SVDD training model so that low-confidence samples play a smaller role in establishing the decision boundary. Experimental results show that, compared with existing SVDD-based outlier detection methods, the proposed method significantly improves outlier detection performance.
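The two steps described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract gives no formulas, so the Gaussian kernel, the fuzzifier `m`, the way the scale parameter `eta` is fixed, and the use of `C * u_i` as a per-sample upper bound on the SVDD dual variables are all assumptions, and the function names are hypothetical.

```python
import math


def rbf(a, b, gamma):
    """Gaussian kernel between two points given as sequences of floats."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))


def one_cluster_kernel_pcm(X, gamma=0.5, m=2.0, n_iter=50):
    """Memberships u_i in (0, 1] of each sample in a single kernel-space cluster.

    Assumed update rules (standard PCM form, one cluster):
      center  v     = sum_j w_j * phi(x_j),  w_j = u_j^m / sum_k u_k^m
      dist    d_i^2 = K_ii - 2 * sum_j w_j K_ij + sum_jk w_j w_k K_jk
      member  u_i   = 1 / (1 + (d_i^2 / eta)^(1 / (m - 1)))
    """
    n = len(X)
    K = [[rbf(X[i], X[j], gamma) for j in range(n)] for i in range(n)]
    u = [1.0] * n
    eta = None
    for _ in range(n_iter):
        um = [ui ** m for ui in u]
        total = sum(um)
        w = [v / total for v in um]
        # Squared norm of the cluster center in feature space.
        vv = sum(w[j] * w[k] * K[j][k] for j in range(n) for k in range(n))
        d2 = [
            max(K[i][i] - 2.0 * sum(w[j] * K[i][j] for j in range(n)) + vv, 0.0)
            for i in range(n)
        ]
        if eta is None:
            # Assumption: fix the scale once, as the mean initial distance.
            eta = sum(d2) / n
        u = [1.0 / (1.0 + (d / eta) ** (1.0 / (m - 1.0))) for d in d2]
    return u


def per_sample_bounds(u, C):
    """Assumed weighting: dual variable alpha_i is boxed by 0 <= alpha_i <= C * u_i."""
    return [C * ui for ui in u]


# Five points near the origin plus one distant point.
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (-0.1, 0.0), (0.0, -0.1), (3.0, 3.0)]
u = one_cluster_kernel_pcm(X)
bounds = per_sample_bounds(u, C=0.5)
```

In this toy data the distant point receives the smallest membership, so its bound `C * u_i` is the tightest and it can contribute least to the SVDD boundary. A weighted SVDD solver would then maximize the usual dual objective subject to these bounds and to the dual variables summing to one, which requires the bounds themselves to sum to at least one.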