In spectral clustering, not all of the top eigenvectors carry clustering information. For real-world data contaminated by noise, the distribution of the elements of an eigenvector is complex, so eigenvector selection is necessary. This paper generalizes the integrated squared error (ISE) divergence and verifies that the proposed generalized integrated squared error (GISE) divergence can be used to estimate the multimodality of a data distribution and to measure the amount of clustering information in an eigenvector. A spectral clustering algorithm based on eigenvector selection is then proposed. Experimental results on the segmentation of a variety of natural images show that the proposed algorithm is simpler and more effective than previous spectral clustering algorithms.
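As a rough illustration of the selection idea, the sketch below scores a vector by the plain ISE, the integral of (f - g)^2, between a Gaussian kernel density estimate f of its standardized elements and a standard normal density g. A vector whose elements form several well-separated groups should deviate from the unimodal reference more than a vector of unstructured noise. This is only a stand-in under stated assumptions: it is not the GISE divergence proposed in the paper, whose definition is not given in this abstract, and the function names, the bandwidth rule, and the grid size are hypothetical choices made for the example.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def ise_multimodality_score(values, grid_size=200):
    """Plain ISE between a KDE of the standardized values and N(0, 1).

    Assumption: this is an illustrative proxy for "how far from unimodal",
    not the GISE divergence of the paper.
    """
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0.0:
        return 0.0  # a constant vector carries no grouping structure

    # Standardize so that scores of different vectors are comparable.
    z = [(v - mean) / std for v in values]

    # Normal-reference rule-of-thumb bandwidth (the data have unit variance).
    h = 1.06 * n ** (-0.2)

    lo, hi = min(z) - 3.0 * h, max(z) + 3.0 * h
    step = (hi - lo) / (grid_size - 1)

    # Riemann-sum approximation of the integral of (kde - reference)^2.
    total = 0.0
    for i in range(grid_size):
        x = lo + i * step
        kde = sum(gaussian_pdf(x, zi, h) for zi in z) / n
        diff = kde - gaussian_pdf(x, 0.0, 1.0)
        total += diff * diff * step
    return total


def select_by_score(vectors, k):
    """Indices of the k vectors with the largest score, highest first."""
    ranked = sorted(range(len(vectors)),
                    key=lambda i: ise_multimodality_score(vectors[i]),
                    reverse=True)
    return ranked[:k]
```

In a spectral clustering pipeline, `vectors` would hold the candidate top eigenvectors of the normalized affinity matrix, and only the selected ones would be passed on to the final clustering step (for example k-means). The cost is O(n × grid_size) per vector, which is acceptable for a sketch but would need a faster density estimate for image-sized problems.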