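Cluster analysis is one of the key problems in data mining and machine learning. Because it is simple and practical, K-means is the most widely used partitional clustering method. This paper proposes a new scheme for selecting the initial points in the traditional K-means algorithm, together with an improved scheme that uses the triangle inequality to accelerate convergence during the K-means computation. Experimental results show that the improved method yields better cluster partitions than traditional K-means clustering.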
Clustering analysis is one of the important problems in the fields of data mining and machine learning. Among clustering methods, K-means is one of the most popular schemes owing to its simplicity and practicality. The standard K-means clustering algorithm is investigated, and an improved algorithm is given that selects the initial centers more carefully and accelerates the convergence process. Experiments show that the new algorithm is more effective than standard K-means clustering and obtains better results in both clustering cost and running time.
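
The abstract does not spell out how the triangle inequality is applied, so the following is only a minimal sketch of one common way to use it in the assignment step, not necessarily the scheme proposed here. It relies on the bound that if d(c, c') >= 2 d(x, c), then d(x, c') >= d(x, c), so the distance from x to c' need not be computed. The function name `assign_with_pruning` and the sample data are hypothetical and chosen for illustration.

```python
import math

def assign_with_pruning(points, centers):
    """Assign each point to its nearest center, skipping distance
    computations that the triangle inequality rules out.
    (Illustrative sketch; not taken from the paper.)"""
    k = len(centers)
    # Center-to-center distances, computed once per assignment pass.
    center_dist = [[math.dist(a, b) for b in centers] for a in centers]
    labels = []
    skipped = 0  # number of point-to-center distances avoided
    for x in points:
        best, best_d = 0, math.dist(x, centers[0])
        for j in range(1, k):
            # If d(c_best, c_j) >= 2 * d(x, c_best), then
            # d(x, c_j) >= d(x, c_best), so c_j cannot be strictly closer.
            if center_dist[best][j] >= 2.0 * best_d:
                skipped += 1
                continue
            d = math.dist(x, centers[j])
            if d < best_d:
                best, best_d = j, d
        labels.append(best)
    return labels, skipped

# Hypothetical sample data: two tight groups and three candidate centers.
points = [(0.0, 0.0), (0.5, 0.2), (10.0, 10.0), (9.5, 10.2)]
centers = [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0)]
labels, skipped = assign_with_pruning(points, centers)
```

The pruned assignment returns the same labels as an exhaustive nearest-center search; the saving comes only from the skipped distance evaluations, which grow as the centers become well separated.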