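To gain a comprehensive understanding of the performance of spectral clustering (SC) algorithms, three of them, normalized cut (Ncut), sparse subspace clustering (SSC), and spectral curvature clustering (SCC), are compared on four types of data with different geometric structures. The results show that the clustering results of the three algorithms differ, but that a relatively most effective algorithm can be found for each type of data. Ncut cannot handle intersecting data and has poor applicability; SSC is widely applicable, but its clustering accuracy is not high; SCC is both widely applicable and accurate, and clusters all four types of geometrically structured data effectively. In addition, an improved SCC algorithm effectively clusters two intersecting spirals with gaps in the data. Finally, the shortcomings of the existing SCC algorithm are analyzed and directions for further research are pointed out.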
To better understand the performance of spectral clustering (SC) algorithms, this paper uses four kinds of data with different geometric structures to compare three such algorithms: normalized cut (Ncut), sparse subspace clustering (SSC), and spectral curvature clustering (SCC). The results show that the three algorithms produce different clustering results, but that a correspondingly effective algorithm can be found for each type of data. Ncut cannot deal with significantly intersecting clusters and therefore has poor applicability; SSC is more widely applicable, but its clustering accuracy is not high; SCC shows strong applicability and high accuracy, and yields good clustering results. Furthermore, an improved method based on the existing SCC algorithm is proposed for two spirals with missing data, and experimental results show that it clusters such data with better performance and effectiveness. Finally, the limitations of the existing SCC algorithm are analyzed, and problems for further research on sparse subspace clustering are discussed.
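As background for the Ncut baseline named in the abstract, the following is a minimal sketch of a two-way normalized cut via its usual spectral relaxation. It is not the code used in the paper. The Gaussian affinity, the bandwidth `sigma`, the function name `ncut_bipartition`, and the toy data (two well-separated 3×3 grids of points) are all assumptions chosen for illustration.

```python
import numpy as np

def ncut_bipartition(points, sigma=1.0):
    """Split points into two groups with a relaxed two-way normalized cut.

    Hypothetical helper for illustration; sigma is an assumed kernel bandwidth.
    """
    X = np.asarray(points, dtype=float)
    # Pairwise squared Euclidean distances between all points.
    diff = X[:, None, :] - X[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    # Gaussian affinity matrix (assumed kernel), with no self-loops.
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    degree = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}.
    L = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigvecs = np.linalg.eigh(L)
    # Map the second-smallest eigenvector back to the generalized
    # problem (D - W) y = lambda * D y, then split by sign.
    y = d_inv_sqrt * eigvecs[:, 1]
    return (y > 0).astype(int)

# Toy data (assumed): two 3x3 grids of points, the second shifted by 4 along x.
grid = np.array([[x, y] for x in (0.0, 0.5, 1.0) for y in (0.0, 0.5, 1.0)])
points = np.vstack([grid, grid + np.array([4.0, 0.0])])
labels = ncut_bipartition(points)
print(labels)
```

Which group receives label 0 or 1 depends on the arbitrary sign of the eigenvector, so only the grouping is meaningful. This sketch covers the easy, well-separated case only; it does not reproduce the intersecting-data experiments summarized above.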