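Spectral clustering is a highly competitive clustering algorithm. The definition of similarity has a critical influence on its performance. This paper uses the number of nearest neighbors shared by two points to characterize local density, and thereby captures implicit information about the cluster structure. Combining this information with a self-tuning Gaussian kernel function, we propose an adaptive similarity based on shared nearest neighbors, together with the corresponding spectral clustering algorithm. The similarity satisfies the clustering assumption, adapts to local density, and can effectively identify the intrinsic connections between data points. Experimental results on typical synthetic and real data sets demonstrate the effectiveness of the algorithm.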
Spectral clustering has become one of the most popular modern clustering algorithms in recent years. The similarity measure is crucial to its performance. By exploiting the local density information embedded in shared nearest neighbors, this paper proposes a novel similarity measure and a corresponding algorithm, adaptive spectral clustering based on shared nearest neighbors. The proposed similarity measure satisfies the clustering assumption and takes different values under different local densities, so it can detect the intrinsic cluster structure embedded in a data set more accurately. Experimental results on both synthetic and real data sets show that it is an effective and feasible way to improve the performance of spectral clustering.
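To make the idea concrete, the following is a minimal sketch of how a shared-nearest-neighbor count might be combined with a self-tuning Gaussian kernel. The abstract does not state the formula, so everything specific here is an assumption made for illustration: the form `exp(-d^2 / (sigma_i * sigma_j * (SNN + 1)))`, the choice of `sigma_i` as the distance from point `i` to its k-th nearest neighbor, the default `k=7`, and the function name `snn_adaptive_similarity`.

```python
import numpy as np


def snn_adaptive_similarity(X, k=7):
    """Hypothetical SNN-based adaptive similarity (illustrative sketch only).

    X is an (n, d) array of points; k is the neighborhood size (assumed).
    Returns a symmetric (n, n) similarity matrix with a zero diagonal.
    """
    n = X.shape[0]

    # Pairwise squared Euclidean distances and distances.
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    dist = np.sqrt(d2)

    # k nearest neighbors of each point, skipping the point itself
    # (assumes no duplicate points, so column 0 of the sort is the point itself).
    order = np.argsort(dist, axis=1)
    knn = order[:, 1:k + 1]

    # Assumed local scale: distance to the k-th nearest neighbor.
    sigma = np.maximum(dist[np.arange(n), knn[:, -1]], 1e-12)

    # member[i, j] = 1 if j is among the k nearest neighbors of i.
    member = np.zeros((n, n), dtype=int)
    member[np.arange(n)[:, None], knn] = 1

    # snn[i, j] = number of neighbors shared by points i and j.
    snn = member @ member.T

    # Assumed combination: more shared neighbors widen the kernel,
    # so points in the same dense region receive a higher similarity.
    W = np.exp(-d2 / (np.outer(sigma, sigma) * (snn + 1)))
    np.fill_diagonal(W, 0.0)
    return W
```

A matrix built this way could then be handed to any standard spectral clustering routine that accepts a precomputed affinity, for example scikit-learn's `SpectralClustering(affinity="precomputed")`. Under this assumed form, two points that share no neighbors fall back to the plain self-tuning kernel, while points with many shared neighbors have their similarity raised.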