A Density-Based Local Outlier Detection Algorithm DLOF

ISSN: 1000-1239
Journal: 《计算机研究与发展》 (Journal of Computer Research and Development)
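Classification: TP301.6 [Automation and Computer Technology: Computer System Architecture; Automation and Computer Technology: Computer Science and Technology] TP18 [Automation and Computer Technology: Control Science and Engineering; Automation and Computer Technology: Control Theory and Control Engineering]
Author affiliation: [1] College of Information Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016
Funding: National High Technology Research and Development Program of China ("863" Program) (No. 2007AA01Z404); National Natural Science Foundation of China (60673127); Nanjing University of Aeronautics and Astronautics Research Start-up Fund (S0848-042); Nanjing University of Aeronautics and Astronautics Fundamental Research Funds (NS2010094)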

Keywords: local outlier, density, local outlier factor, information entropy, outlier attribute

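Abstract (translated from Chinese):

Outliers can be divided into global outliers and local outliers. In many cases, mining local outliers is more meaningful than mining global outliers. This paper proposes DLOF, a density-based local outlier detection algorithm. The method uses information entropy to determine the outlier attributes of each object, computes the distance between objects with a weighted distance, and assigns larger weights to the outlier attributes, thereby improving the accuracy of outlier detection. In addition, the algorithm applies a two-step optimization when computing the outlier factors, and the time complexity of the algorithm with these two optimizations is analyzed in detail. Theoretical analysis and experimental results show that the method is effective and feasible.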

English abstract:

With the rapid growth of data, data mining is becoming increasingly important. Outlier detection is an important data mining technique that finds exceptional objects deviating from most of the rest of the data set. There are two kinds of outliers: global outliers and local outliers. In many scenarios, detecting local outliers is more valuable than detecting global outliers. The LOF algorithm is a well-known local outlier detection algorithm that assigns each object an outlier-degree value. However, when calculating this value, the algorithm treats all attributes equally. In fact, different attributes have different effects, and the attributes with larger effects are known as outlier attributes. In this paper, a density-based local outlier detection algorithm (DLOF) is proposed, which derives the outlier attributes of each data object using information entropy. A weighted distance is introduced to calculate the distance between two data objects, in which the outlier attributes are assigned larger weights, so the algorithm improves outlier detection accuracy. In addition, for the calculation of the local outlier factors, we present two improvements to the algorithm and analyze their time complexity. Theoretical analysis and experimental results show that DLOF is efficient and effective.
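
The idea of combining a weighted distance with a local outlier factor can be illustrated with a minimal Python sketch. It is not the paper's DLOF implementation: the abstract does not say how information entropy is turned into attribute weights, nor what the two optimizations are, so the sketch takes the weights as a caller-supplied input and computes the standard LOF score under a weighted Euclidean distance. The function names `weighted_distance` and `lof_scores`, the sample points, and the weights are all hypothetical.

```python
import math


def weighted_distance(p, q, weights):
    """Weighted Euclidean distance; a larger weight emphasizes that attribute."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(p, q, weights)))


def lof_scores(points, weights, k):
    """Standard LOF scores computed with a weighted distance.

    Assumes positive weights, distinct points, and k < len(points).
    The weights are an input here; deriving them from information
    entropy (as DLOF does) is not reproduced in this sketch.
    """
    n = len(points)
    dist = [[weighted_distance(points[i], points[j], weights) for j in range(n)]
            for i in range(n)]

    k_dist, neighbors = [], []
    for i in range(n):
        others = sorted((dist[i][j], j) for j in range(n) if j != i)
        kd = others[k - 1][0]  # distance to the k-th nearest neighbor
        k_dist.append(kd)
        # The k-distance neighborhood keeps ties at the k-distance.
        neighbors.append([j for d, j in others if d <= kd])

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], dist[i][j]) for j in neighbors[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: average ratio of the neighbors' density to the object's own density.
    return [sum(lrd[j] for j in neighbors[i]) / (len(neighbors[i]) * lrd[i])
            for i in range(n)]


# Four points forming a tight square plus one point far away in attribute 0.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (5.0, 0.5)]
# Hypothetical weights that treat attribute 0 as the "outlier attribute".
scores = lof_scores(points, weights=(4.0, 1.0), k=2)
print(scores)
```

Objects inside a uniformly dense region receive a score near 1, while an object whose neighbors are much denser than itself receives a clearly larger score, so the last point in the sample stands out.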
