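Existing clustering validity indices based on geometric structure cannot effectively evaluate clustering results on data with different structures. To address this, a method is proposed that uses classification to evaluate clustering results. The method treats the object labels produced by clustering as the known class labels of a classification problem, reclassifies the data set using cross-validation, and measures clustering validity by the difference between the clustering result and the classification result. The underlying premise is that a data set whose structure is easy to cluster should also be easy to classify. Experiments and analysis on synthetic and real data verify the feasibility and effectiveness of the method.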
Clustering validation is key to the success of clustering. One approach to validating clustering results is to use a validity index based on geometric structure. However, no single geometric index suits all kinds of data structures. This paper proposes a classification-based method for clustering validation, which uses the labels produced by clustering as class labels for classification. Experimental results and analysis on both synthetic and real data show the effectiveness of the proposed method.
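The idea of scoring a clustering by how well a classifier can reproduce its labels under cross-validation can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the clustering algorithm (plain k-means), the classifier (1-nearest-neighbour), the number of folds, the agreement score (fraction of held-out points whose predicted label matches their cluster label), and the helper names `kmeans`, `cv_agreement` and `make_blobs`.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (assumed clusterer); returns one label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = []
    for _ in range(iters):
        # Assign each point to its nearest centre.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        # Move each centre to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return labels


def cv_agreement(points, labels, n_folds=5, seed=0):
    """Treat cluster labels as class labels and re-predict them by
    cross-validation with a 1-nearest-neighbour classifier (assumed).
    Returns the fraction of points whose prediction matches the cluster label."""
    rng = random.Random(seed)
    idx = list(range(len(points)))
    rng.shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    hits = 0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        for i in held_out:
            nearest = min(train, key=lambda j: sq_dist(points[i], points[j]))
            hits += labels[nearest] == labels[i]
    return hits / len(points)


def make_blobs(seed=0, n=50):
    """Hypothetical toy data: two well-separated 2-D Gaussian blobs."""
    rng = random.Random(seed)
    pts = []
    for cx, cy in [(0.0, 0.0), (10.0, 10.0)]:
        pts += [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)) for _ in range(n)]
    return pts


if __name__ == "__main__":
    pts = make_blobs()
    print("agreement for k-means labels:", cv_agreement(pts, kmeans(pts, 2)))
```

Under these assumptions, a labelling that follows the structure of the data should be easy for the classifier to reproduce and so score close to 1, while an arbitrary labelling should score much lower. The score can therefore be compared across candidate clusterings of the same data set.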