An outlier detection algorithm based on principal component analysis (PCA) and the sum of attribute distances is proposed. The algorithm first uses PCA to extract, from a large number of attributes, the principal components that satisfy the cumulative contribution rate, and at the same time uses the PCA transformation matrix to map the original dataset into the new feature space spanned by those principal components. Outliers are then detected in the transformed dataset using the sum-of-attribute-distances method. Experimental results demonstrate the effectiveness of the proposed algorithm.
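The two stages described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not define the sum of attribute distances, so the sketch assumes it is, for each point, the sum over all other points of the absolute per-attribute differences in the transformed space. The function names, the 0.85 contribution threshold, and the rule of reporting the `n_outliers` highest-scoring points are likewise hypothetical choices made for the example.

```python
import numpy as np


def pca_transform(X, contribution=0.85):
    """Project X onto the leading principal components.

    Keeps the smallest number of components whose cumulative
    contribution rate (explained-variance ratio) reaches `contribution`.
    The 0.85 default is an assumed value, not taken from the source.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                       # center each attribute
    cov = np.atleast_2d(np.cov(Xc, rowvar=False))  # attribute covariance
    eigvals, eigvecs = np.linalg.eigh(cov)         # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]              # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratio = np.cumsum(eigvals) / eigvals.sum()     # cumulative contribution
    k = min(int(np.searchsorted(ratio, contribution)) + 1, len(eigvals))
    return Xc @ eigvecs[:, :k]                     # transformed dataset


def attribute_distance_sum(Z):
    """Assumed score: for each point, the sum over all other points
    of the absolute differences on every (transformed) attribute."""
    diffs = np.abs(Z[:, None, :] - Z[None, :, :])  # shape (n, n, k)
    return diffs.sum(axis=(1, 2))


def detect_outliers(X, contribution=0.85, n_outliers=1):
    """Return indices of the `n_outliers` highest-scoring points
    (hypothetical decision rule) together with all scores."""
    Z = pca_transform(X, contribution)
    scores = attribute_distance_sum(Z)
    top = np.argsort(scores)[::-1][:n_outliers]
    return top, scores
```

Calling `detect_outliers(X, n_outliers=3)` on an `(n_samples, n_attributes)` array would return the three points with the largest assumed distance sums. The pairwise computation uses O(n²k) memory, which is acceptable for a small illustration but would need chunking for large datasets.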