Clustering is an important research direction in data mining. Building on the density-based clustering algorithm DBSCAN, this paper proposes an improved density-based clustering algorithm. When expanding the neighborhood of a core point, the algorithm no longer uses the points inside that neighborhood as seeds; instead, it selects, in order, an unlabelled point outside the neighborhood as the seed and then expands the cluster according to which of several cases applies. This reduces the number of region queries spent on the overlapping parts of core-point neighborhoods and, consequently, the running time. Experimental results also show that the efficiency and clustering quality of the proposed algorithm are clearly better than those of the original DBSCAN.
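For orientation, the cost that the abstract targets can be made concrete with a minimal sketch of baseline DBSCAN. This is the standard algorithm, not the improved one: the abstract does not spell out the case analysis used for seeds chosen outside the neighborhood, so no attempt is made to reproduce it here. The names `dbscan`, `region_query`, `eps`, and `min_pts`, and the brute-force neighborhood search, are illustrative assumptions.

```python
from math import dist  # Euclidean distance, Python 3.8+

NOISE = -1
UNVISITED = None


def dbscan(points, eps, min_pts):
    """Baseline DBSCAN sketch; returns (labels, number_of_region_queries)."""
    labels = [UNVISITED] * len(points)
    queries = 0

    def region_query(i):
        # Brute-force eps-neighborhood of point i (includes i itself).
        nonlocal queries
        queries += 1
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(i)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        labels[i] = cluster
        # Baseline behaviour: every point inside the neighborhood is a seed.
        seeds = [j for j in neighbors if j != i]
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point, no further expansion
                continue
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(j)  # largely overlaps earlier queries
            if len(j_neighbors) >= min_pts:
                seeds.extend(j_neighbors)
        cluster += 1
    return labels, queries


# Toy data (made up for illustration): two tight groups and one outlier.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels, queries = dbscan(points, eps=1.5, min_pts=3)
print(labels, queries)
```

In this baseline every point triggers exactly one region query, even when its neighborhood almost coincides with one already examined. Seeding from outside the core point's neighborhood, as the abstract describes, is aimed at avoiding those redundant queries.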