The ordinary k-means algorithm has difficulty recovering the cluster structure of data sets whose points are locally sparse and unevenly distributed and whose clusters are irregular in shape and size. To overcome this shortcoming, and guided by the idea of semi-supervised learning, a k-means clustering framework based on semi-supervised learning is proposed for a small labeled sample and a large unlabeled sample that share the same distribution in a region of a hybrid (mixed) attribute space. Simulation experiments show that the framework often achieves better clustering accuracy than the k-means algorithm, which indicates that the semi-supervised learning framework is effective to some extent.
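
The abstract does not spell out the framework's steps, so the following is only a minimal sketch of one common way to combine a small labeled sample with k-means, often called seeded or constrained k-means. It is not claimed to be the method proposed here. It assumes purely numeric attributes and Euclidean distance (the hybrid-attribute case would need a mixed distance measure), and the name `seeded_kmeans` and its parameters are hypothetical.

```python
import math


def seeded_kmeans(labeled, unlabeled, max_iter=100):
    """Cluster `unlabeled` points using `labeled` points as seeds.

    labeled:   list of (point, label) pairs, each point a tuple of floats
    unlabeled: list of points (tuples of floats)
    Returns (assignment, centroids): the label given to each unlabeled
    point and a dict mapping each label to its final centroid.
    Assumption: numeric attributes only, Euclidean distance.
    """
    labels = sorted({lab for _, lab in labeled})

    def mean(points):
        # Coordinate-wise mean of a non-empty list of points.
        n = len(points)
        return tuple(sum(coord) / n for coord in zip(*points))

    # Initialise one centroid per class from the labeled sample.
    centroids = {
        lab: mean([p for p, l in labeled if l == lab]) for lab in labels
    }

    assignment = None
    for _ in range(max_iter):
        # Assign every unlabeled point to its nearest centroid.
        new_assignment = [
            min(labels, key=lambda lab: math.dist(p, centroids[lab]))
            for p in unlabeled
        ]
        if new_assignment == assignment:
            break  # assignments are stable, so the centroids are too
        assignment = new_assignment

        # Recompute centroids; labeled points always stay in their class.
        for lab in labels:
            members = [p for p, l in labeled if l == lab]
            members += [p for p, a in zip(unlabeled, assignment) if a == lab]
            centroids[lab] = mean(members)

    return assignment, centroids
```

In this sketch the labeled sample plays two roles: it fixes the number of clusters and their initial centroids, and it keeps contributing to every centroid update, which is one plausible way prior class information can steer k-means away from poor partitions.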