Common clustering algorithms for multidimensional data usually require the number of clusters k to be specified in advance. In most cases, however, k cannot be determined beforehand, so the optimal number of clusters must be found as part of the analysis. This paper adopts a clustering algorithm based on particle swarm optimization (PSO). Because PSO-based clustering cannot by itself determine k, the k-means algorithm is introduced, which makes it possible both to solve for the optimal number of clusters and to construct a cluster validity function. Experiments show that a validity function incorporating the inter-cluster distance identifies the optimal number of clusters reliably, and that weighting the inter-cluster distance within the function makes it better suited to the analysis of real-world data.
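The idea of choosing k with a validity function can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not give the validity function or the PSO hybrid, so the sketch swaps in plain k-means (with a deterministic farthest-point seeding) for the clustering step and uses an assumed score, the mean within-cluster squared distance divided by the smallest squared distance between centers raised to a weight. The names `kmeans`, `validity`, `best_k`, and `sep_weight` are hypothetical, and lower scores are treated as better.

```python
from itertools import combinations

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_point_init(points, k):
    """Deterministic seeding: start at points[0], then repeatedly add
    the point farthest from the centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers

def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j])) for p in points]

def kmeans(points, k, max_iter=100):
    """Plain k-means (a stand-in for the PSO-based clustering step)."""
    centers = farthest_point_init(points, k)
    for _ in range(max_iter):
        labels = assign(points, centers)
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old center if a cluster happens to be empty.
            updated.append(tuple(sum(c) / len(members) for c in zip(*members))
                           if members else centers[j])
        if updated == centers:
            break
        centers = updated
    return centers, assign(points, centers)

def validity(points, centers, labels, sep_weight=1.0):
    """Assumed validity score (lower is better): compactness divided by
    the weighted minimum inter-cluster separation."""
    compactness = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels)) / len(points)
    min_sep = min(sq_dist(a, b) for a, b in combinations(centers, 2))
    return float("inf") if min_sep == 0 else compactness / min_sep ** sep_weight

def best_k(points, k_range, sep_weight=1.0):
    """Cluster for every candidate k and return the k with the lowest score."""
    scores = {}
    for k in k_range:
        centers, labels = kmeans(points, k)
        scores[k] = validity(points, centers, labels, sep_weight)
    return min(scores, key=scores.get), scores

# Toy data: three well-separated 3x3 grids of points.
offsets = (-0.5, 0.0, 0.5)
blobs = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(cx + dx, cy + dy) for cx, cy in blobs for dx in offsets for dy in offsets]

k_star, scores = best_k(points, range(2, 6))
print(k_star)  # 3
```

On this toy data the score is lowest at k = 3: merging two groups (k = 2) inflates the compactness term, while splitting a group (k = 4 or 5) collapses the minimum separation. Raising `sep_weight` above 1 penalizes close centers more strongly, which is one possible reading of weighting the inter-cluster distance.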