To address the inability of DBSCAN to find clusters of varying density and its sensitivity to parameter settings, this paper proposes a self-adaptive spatial clustering method based on varied density. The algorithm uses the rate of change of density to identify the boundaries between clusters of different densities, and it adjusts its parameter values automatically at runtime. Specifically, the density of a point is defined as the distance from the point to its k-th nearest neighbor. If the density change rate between a point and one of its nearest neighbors is below a user-given threshold, that neighbor is called a similar neighbor. A core point is redefined as a point that has at least k similar neighbors among its nearest neighbors. On this basis, DBSCAN is applied as a breadth-first search that marks core points of similar density that are reachable by distance, together with their nearest neighbors, as the same cluster. When judging similar neighbors, the algorithm automatically adjusts the parameter values according to the average density and the density change rate of the core points already added to the cluster. Experimental results show that the method accurately finds clusters of arbitrary shape, size, and density and eliminates outliers, and that its self-adaptive mechanism makes suitable parameters easier to set than in other algorithms.
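The steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions because the abstract does not specify them: the change rate is taken as |d_q - d_ref| / d_ref, the neighborhood of a point is its k nearest neighbors, the number of similar neighbors required for a core point is a separate hypothetical parameter `min_similar`, and the adaptive step is modeled by comparing each candidate against the running mean density of the core points already in the cluster. The names `knn`, `cluster`, `max_rate`, and `min_similar` are illustrative only.

```python
import math
from collections import deque


def knn(points, k):
    """For each point, return its k nearest neighbors as (distance, index) pairs."""
    result = []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        result.append(dists[:k])
    return result


def cluster(points, k, max_rate, min_similar):
    """Return one label per point; -1 marks points left as outliers.

    Assumed, not taken from the source: the change-rate formula, the use of
    the k nearest neighbors as the neighborhood, the min_similar parameter,
    and the running-mean form of the adaptive step.
    """
    neighbors = knn(points, k)
    # Density as defined in the abstract: distance to the k-th nearest neighbor.
    density = [nb[-1][0] for nb in neighbors]

    def rate(ref, value):
        # Assumed change rate: relative difference with respect to `ref`.
        if ref == 0:
            return 0.0 if value == 0 else math.inf
        return abs(value - ref) / ref

    def similar_count(i):
        return sum(1 for _, j in neighbors[i]
                   if rate(density[i], density[j]) < max_rate)

    is_core = [similar_count(i) >= min_similar for i in range(len(points))]

    labels = [-1] * len(points)
    next_label = 0
    for seed in range(len(points)):
        if not is_core[seed] or labels[seed] != -1:
            continue
        labels[seed] = next_label
        queue = deque([seed])
        total, count = density[seed], 1  # running sum/count over cluster core points
        while queue:  # breadth-first expansion from core points
            i = queue.popleft()
            mean = total / count  # adaptive reference density (assumption)
            for _, j in neighbors[i]:
                if labels[j] != -1 or rate(mean, density[j]) >= max_rate:
                    continue
                labels[j] = next_label
                if is_core[j]:
                    queue.append(j)
                    total += density[j]
                    count += 1
        next_label += 1
    return labels
```

Calling `cluster(points, k=3, max_rate=0.5, min_similar=3)` on a list of coordinate tuples returns one integer label per point. The brute-force neighbor search is quadratic in the number of points and is used here only to keep the sketch self-contained; a spatial index would be the natural replacement for larger data sets.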