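In spectral clustering, the eigenvectors corresponding to the k largest eigenvalues do not necessarily give the best clustering result. This paper therefore adopts a selective ensemble of eigenvector groups to improve spectral clustering performance, which involves selecting the base eigenvector groups and choosing a selective ensemble strategy. Pairwise constraint information from the training data is used to score the eigenvectors and select good base eigenvector groups. For each test sample, the clustering performance on its l nearest neighbors in the training data is used to dynamically evaluate each eigenvector group, and a small number of groups are selected to vote. Spectral clustering is performed on the test dataset with each of these eigenvector groups, the resulting partitions are aligned, and the final clustering result is produced. Experiments show that the dynamic selective ensemble method improves clustering performance on the test data.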
Since the eigenvectors corresponding to the k largest eigenvalues do not always achieve the optimal clustering result, a selective ensemble approach over eigenvector groups is adopted to improve spectral clustering performance. The approach involves the selection of base eigenvector groups and a selective ensemble strategy. A constraint score, computed from the pairwise constraint information of the training data, is used to evaluate eigenvectors, and a set of preferable base eigenvector groups is obtained. For each testing sample, the clustering accuracy on its l nearest neighbors in the training dataset is used to dynamically evaluate the eigenvector groups, and the few most accurate groups are selected to vote. Spectral clustering is then carried out on the testing dataset with each selected eigenvector group, the resulting partitions are aligned, and the final clustering result is obtained. Experimental results on UCI benchmark datasets show that the proposed algorithm improves the clustering performance on testing data.
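As a rough illustration of two steps named above, the following Python sketch scores a single eigenvector against pairwise constraints and then aligns several partitions before a majority vote. It is a minimal sketch under stated assumptions, not the paper's implementation. The function names (`constraint_score`, `align_labels`, `vote`) are hypothetical. The score is assumed to be the ratio of squared differences over must-link pairs to squared differences over cannot-link pairs (lower is better), which is one common form of constraint score. Alignment is done by brute-force search over label permutations, a stand-in that is only practical for small numbers of clusters.

```python
from collections import Counter
from itertools import permutations


def constraint_score(vec, must_link, cannot_link):
    """Assumed form: must-link spread divided by cannot-link spread (lower is better)."""
    ml = sum((vec[i] - vec[j]) ** 2 for i, j in must_link)
    cl = sum((vec[i] - vec[j]) ** 2 for i, j in cannot_link)
    # An eigenvector that cannot separate any cannot-link pair gets the worst score.
    return ml / cl if cl > 0 else float("inf")


def align_labels(reference, labels, k):
    """Relabel `labels` with the permutation of 0..k-1 that agrees most with `reference`."""
    best_perm, best_hits = None, -1
    for perm in permutations(range(k)):
        hits = sum(perm[l] == r for l, r in zip(labels, reference))
        if hits > best_hits:
            best_perm, best_hits = perm, hits
    return [best_perm[l] for l in labels]


def vote(partitions, k):
    """Align every partition to the first one, then take a per-point majority vote."""
    reference = partitions[0]
    aligned = [reference] + [align_labels(reference, p, k) for p in partitions[1:]]
    return [Counter(column).most_common(1)[0][0] for column in zip(*aligned)]


if __name__ == "__main__":
    # Toy eigenvector entries for four points; pairs are index tuples.
    good = [0.0, 0.1, 1.0, 1.1]
    bad = [0.0, 1.0, 0.0, 1.0]
    must, cannot = [(0, 1), (2, 3)], [(0, 2), (1, 3)]
    print(constraint_score(good, must, cannot), constraint_score(bad, must, cannot))

    # Three partitions of six points that use different label names.
    parts = [[0, 0, 1, 1, 2, 2], [1, 1, 2, 2, 0, 0], [2, 2, 0, 0, 1, 0]]
    print(vote(parts, k=3))
```

In this toy setup the first partition serves as the reference labeling. In the setting described above, the partitions would come from spectral clustering runs on the selected eigenvector groups, and the candidate groups would be ranked by their scores before the dynamic selection step.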