To address the inability of classical spectral clustering algorithms to determine the number of classes automatically, this paper proposes an automatic spectral clustering (ASC) algorithm based on the eigengap and orthogonal eigenvectors. The method first constructs an affinity matrix from the sample data and performs a spectral decomposition to obtain its eigenvalues and eigenvectors. The eigenvalues are then sorted in descending order, the eigengap is used to measure the difference between adjacent eigenvalues, and the position of the first maximal eigengap determines the number of classes. Finally, the data are classified by combining the estimated number of classes with a similarity measure defined as the angle between eigenvectors. The correctness of the algorithm is verified on artificial datasets, and its classification accuracy is compared with that of k-means, FCM, and the Jordan algorithm on UCI datasets. The experimental results show that ASC achieves higher classification accuracy than the other three methods.
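The pipeline summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, because the abstract does not give the details, so several choices here are assumptions: the Gaussian kernel and its width `sigma`, the symmetric degree normalization of the affinity matrix, the noise floor `min_gap` used to ignore numerically zero eigengaps, and the greedy grouping with `cos_threshold` that stands in for the angle-based classification step. All function names are hypothetical.

```python
import numpy as np


def gaussian_affinity(X, sigma=1.0):
    """Affinity matrix from a Gaussian kernel (assumed; the abstract names no kernel)."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    A = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)  # no self-affinity
    return A


def sorted_spectrum(A):
    """Eigenvalues (descending) and matching eigenvectors of D^{-1/2} A D^{-1/2}."""
    d = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]  # assumed normalization
    vals, vecs = np.linalg.eigh(L)  # eigh returns ascending eigenvalues
    order = np.argsort(vals)[::-1]
    return vals[order], vecs[:, order]


def estimate_num_classes(vals, min_gap=1e-3):
    """Position of the first local maximum of the eigengap sequence.

    Gaps at or below min_gap are treated as numerical noise (an assumption).
    """
    gaps = vals[:-1] - vals[1:]
    for i in range(len(gaps)):
        left = gaps[i - 1] if i > 0 else -np.inf
        right = gaps[i + 1] if i + 1 < len(gaps) else -np.inf
        if gaps[i] > min_gap and gaps[i] > left and gaps[i] >= right:
            return i + 1  # a gap after the i-th (0-based) eigenvalue means i + 1 classes
    return 1


def group_by_angle(vecs, k, cos_threshold=0.5):
    """Greedy grouping by the angle between rows of the top-k eigenvector matrix.

    Rows of points in the same class are nearly parallel and rows of points in
    different classes are nearly orthogonal when the classes are well separated.
    This greedy rule is a stand-in and may return more or fewer than k groups
    on harder data.
    """
    Y = vecs[:, :k]
    Y = Y / np.maximum(np.linalg.norm(Y, axis=1, keepdims=True), 1e-12)
    cos = Y @ Y.T  # cosine of the angle between every pair of rows
    labels = np.full(len(Y), -1)
    next_label = 0
    for i in range(len(Y)):
        if labels[i] != -1:
            continue
        members = (labels == -1) & (cos[i] > cos_threshold)
        labels[members] = next_label
        next_label += 1
    return labels


# Usage on three well-separated synthetic blobs (illustrative data only).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + 0.2 * rng.standard_normal((20, 2)) for c in centers])
vals, vecs = sorted_spectrum(gaussian_affinity(X))
k = estimate_num_classes(vals)
labels = group_by_angle(vecs, k)
print(k, labels)
```

On well-separated data the leading eigenvalues are all close to 1 and the first substantial eigengap appears right after them, which is why its position serves as the class-count estimate. When classes overlap, the gaps between the leading eigenvalues grow and the result becomes sensitive to `sigma` and `min_gap`.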