To address the shortcomings of the K-means algorithm, this paper proposes an algorithm that optimizes the initial center points. A density-sensitive similarity measure is used to compute the density of each object, and the initial centers of the samples are then generated heuristically. An evaluation function, called the equalization function, is designed and used as the criterion for automatically determining the number of clusters. Compared with the traditional algorithm, the proposed algorithm obtains higher-quality initial centers and more stable clustering results. Experimental results demonstrate its effectiveness and feasibility.
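The idea of choosing initial centers from object densities can be illustrated with a minimal sketch. It is not the method of this paper: the abstract does not define the density-sensitive similarity measure or the equalization function, so the sketch assumes plain Euclidean distance, a neighbor count within a hypothetical `radius` as the density, and a fixed number of clusters `k` supplied by the caller. All function names are illustrative.

```python
import math


def euclidean(a, b):
    """Plain Euclidean distance (an assumed stand-in for the paper's measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def local_density(points, radius):
    """For each point, count how many other points lie within `radius`."""
    return [
        sum(1 for j, q in enumerate(points)
            if j != i and euclidean(p, q) <= radius)
        for i, p in enumerate(points)
    ]


def density_initial_centers(points, k, radius):
    """Pick k initial centers that are both dense and far from each other.

    Assumed heuristic: the first center is the densest point; each later
    center maximizes density * (distance to the nearest chosen center).
    """
    density = local_density(points, radius)
    chosen = [max(range(len(points)), key=lambda i: density[i])]
    while len(chosen) < k:
        def score(i):
            nearest = min(euclidean(points[i], points[c]) for c in chosen)
            return density[i] * nearest
        candidates = [i for i in range(len(points)) if i not in chosen]
        chosen.append(max(candidates, key=score))
    return [points[i] for i in chosen]


# Two compact groups plus one isolated outlier.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 30)]
centers = density_initial_centers(points, k=2, radius=1.5)
print(centers)
```

In this toy data the outlier has no neighbors within the radius, so its density is zero and it is never preferred over a point inside a group. The returned centers could then be passed to a standard K-means iteration as its starting points.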