针对数据竞争聚类算法在处理复杂结构数据集时聚类性能不佳的问题,提出了一种密度敏感的数据竞争聚类算法。首先,在密度敏感距离测度的基础上定义了局部距离,以描述数据分布的局部一致性;其次,在局部距离的基础上计算出数据间的全局距离,用来描述数据分布的全局一致性,挖掘数据的空间分布信息,以弥补欧氏距离描述数据分布全局一致性能力不佳的缺陷;最后,将全局距离用于数据竞争聚类算法中。将新算法与基于欧氏距离的数据竞争聚类算法进行性能比较,在人工数据集和真实数据集上的实验结果表明,该算法克服了数据竞争聚类算法难以处理复杂结构数据的缺点,聚类结果具有更高的准确率。
Since the clustering by data competition algorithm has poor performance on complex datasets, a density- sensitive clustering by data competition algorithm was proposed. Firstly, the local distance was defined based on density- sensitive distance measure to describe the local consistency of data distribution. Secondly, the global distance was calculated based on local distance to describe the global consistency of data distribution and dig the information of data space distribution, which can make up for the defect of Euclidean distance on describing the global consistency of data distribution. Finally, the global distance was used in clustering by data competition algorithm. Using synthetic and real life datasets, the comparison experiments were conducted on the proposed algorithm and the original clustering by data competition based on Euclidean distance. The simulation results show that the proposed algorithm can obtain better performance in clustering accuracy rate and overcome the defect that clustering by data competition algorithm is difficult to handle complex datasets.