Kmeans is the most typical clustering algorithm and is widely used because it is simple and fast. To address two weaknesses of traditional Kmeans, namely its sensitivity to the initial cluster centers and the difficulty of determining the clustering parameter k, this paper proposes a Kmeans algorithm based on correlational graph partitioning. The algorithm effectively selects the initial cluster centers according to the distribution characteristics of the data, and it adaptively determines the number of clusters for a specified degree of data density. Validation experiments show that the improved Kmeans algorithm achieves high accuracy and stability.
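The abstract does not specify how the correlational graph is built or partitioned, so the following is only a minimal sketch of the general idea under assumed choices, not the paper's actual method. It assumes that two points are linked when their Euclidean distance is at most a radius `eps` (standing in for the specified degree of data density), that the graph is partitioned into its connected components, and that each component with at least `min_size` points contributes its mean as one initial center. The number of such components then serves as k. The names `graph_components`, `graph_seeded_kmeans`, `eps`, and `min_size` are hypothetical and introduced here for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def graph_components(points: Sequence[Point], eps: float) -> List[List[int]]:
    """Connected components of the graph linking points at distance <= eps.

    ASSUMPTION: this eps-neighbourhood graph is a stand-in for the
    correlational graph, whose construction the abstract does not give.
    """
    n = len(points)
    seen = [False] * n
    components: List[List[int]] = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack, comp = [start], []
        while stack:  # depth-first traversal of one component
            i = stack.pop()
            comp.append(i)
            for j in range(n):
                if not seen[j] and math.dist(points[i], points[j]) <= eps:
                    seen[j] = True
                    stack.append(j)
        components.append(comp)
    return components


def mean(points: Sequence[Point]) -> Point:
    """Coordinate-wise mean of a non-empty collection of points."""
    return tuple(sum(coord) / len(points) for coord in zip(*points))


def graph_seeded_kmeans(
    points: Sequence[Point], eps: float, min_size: int = 2, max_iter: int = 100
) -> Tuple[List[Point], List[int]]:
    """Kmeans whose k and initial centers come from the graph components."""
    # ASSUMPTION: components smaller than min_size are treated as noise
    # and do not seed a cluster.
    comps = [c for c in graph_components(points, eps) if len(c) >= min_size]
    if not comps:
        raise ValueError("no component reaches min_size; try a larger eps")
    centers = [mean([points[i] for i in c]) for c in comps]

    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            new_centers.append(mean(members) if members else centers[k])
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return centers, labels
```

Under these assumptions, calling `graph_seeded_kmeans(points, eps=1.5)` on two well-separated groups of points returns two centers without k being supplied. The result does not depend on a random initialization, which is the kind of stability the abstract refers to. A larger `eps` merges nearby groups and yields fewer clusters.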