Many clustering algorithms require the user to specify the number of clusters before clustering, which is often very difficult for the user. In this paper, a bisecting strategy is used to recursively split every cluster whose intra-cluster similarity is greater than a given threshold; finally, clusters whose inter-cluster similarity is less than a given threshold are merged to obtain the final number of clusters. Experiments show that the number of clusters determined by the proposed algorithm equals the actual number of clusters, that the similarity of data within a cluster is high while the similarity of data between clusters is low, and that the algorithm is simple and efficient.
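The split-then-merge procedure described above can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not define the similarity measures or the bisecting step, so the sketch assumes distance-like measures (a larger value means less similar, which matches the threshold directions as stated), takes the intra-cluster measure to be the largest distance from a point to its cluster centroid, takes the inter-cluster measure to be the distance between centroids, and bisects with a plain 2-means loop. All function and parameter names (`bisect`, `intra_measure`, `inter_measure`, `estimate_clusters`, `intra_threshold`, `inter_threshold`) are hypothetical.

```python
import numpy as np


def bisect(points, rng, iters=20):
    """Split points into two groups with a plain 2-means loop (assumed bisecting step)."""
    # Start from two distinct data points as the initial centers.
    centers = points[rng.choice(len(points), size=2, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        if labels.min() == labels.max():  # one side is empty; give up on this split
            break
        centers = np.array([points[labels == k].mean(axis=0) for k in (0, 1)])
    return points[labels == 0], points[labels == 1]


def intra_measure(points):
    """Assumed intra-cluster measure: largest distance from a point to the centroid."""
    return float(np.linalg.norm(points - points.mean(axis=0), axis=1).max())


def inter_measure(a, b):
    """Assumed inter-cluster measure: distance between the two centroids."""
    return float(np.linalg.norm(a.mean(axis=0) - b.mean(axis=0)))


def estimate_clusters(data, intra_threshold, inter_threshold, seed=0):
    rng = np.random.default_rng(seed)
    pending, final = [np.asarray(data, dtype=float)], []

    # Phase 1: recursively bisect clusters whose intra measure exceeds the threshold.
    while pending:
        cluster = pending.pop()
        if len(cluster) < 2 or intra_measure(cluster) <= intra_threshold:
            final.append(cluster)
            continue
        left, right = bisect(cluster, rng)
        if len(left) == 0 or len(right) == 0:
            final.append(cluster)  # could not be split any further
        else:
            pending.extend([left, right])

    # Phase 2: merge pairs whose inter measure is below the threshold,
    # restarting the scan after every merge until no pair qualifies.
    merged = True
    while merged:
        merged = False
        for i in range(len(final)):
            for j in range(i + 1, len(final)):
                if inter_measure(final[i], final[j]) < inter_threshold:
                    final[i] = np.vstack([final[i], final[j]])
                    del final[j]
                    merged = True
                    break
            if merged:
                break
    return final


# Usage: three synthetic, well-separated groups; the number of clusters is not given.
demo_rng = np.random.default_rng(1)
demo = np.vstack([c + demo_rng.uniform(-0.5, 0.5, size=(50, 2))
                  for c in ([0.0, 0.0], [10.0, 0.0], [0.0, 10.0])])
print(len(estimate_clusters(demo, intra_threshold=1.5, inter_threshold=3.0)))
```

The number of clusters is simply the length of the returned list. In this sketch both thresholds are expressed in the same distance units as the data, so they would have to be chosen with the scale of the data in mind.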