Clustering high-dimensional data is a serious challenge for clustering algorithms because of the inherent sparsity of the data and the presence of noise. A new linear manifold clustering method is proposed to address this problem. The basic idea is to search for the line manifold clusters hidden in a data set and then fuse some of these line manifold clusters to construct higher-dimensional manifold clusters. The orthogonal distance and the tangent distance are considered together as the linear manifold distance metrics. Spatial neighbor information is fully utilized to construct the original line manifolds and to optimize them during the line manifold cluster search. Experiments on real and synthetic data sets demonstrate the superiority of the proposed method over several competing clustering methods in terms of accuracy and computation time. The proposed method achieves high clustering accuracy on data sets of different sizes, manifold dimensions, and noise ratios, which confirms its robustness to noise and its high clustering accuracy for high-dimensional data.
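To make the two distance metrics concrete, the following minimal Python sketch computes both for a single point and a single line manifold. The abstract does not define either distance, so the definitions here are assumptions: the orthogonal distance is taken as the length of the component of the point perpendicular to the line, and the tangent distance as the length of its projection along the line, measured from a reference point on the line. The function name `line_manifold_distances` and its parameters are hypothetical and are not taken from the proposed method.

```python
import math


def line_manifold_distances(point, origin, direction):
    """Return (orthogonal, tangent) distances of `point` to a line manifold.

    Assumed definitions (not given in the abstract):
      - orthogonal: distance from the point to its projection on the line
      - tangent: distance along the line from `origin` to that projection
    The line is the set {origin + t * direction}.
    """
    norm = math.sqrt(sum(c * c for c in direction))
    if norm == 0.0:
        raise ValueError("direction must be a non-zero vector")
    unit = [c / norm for c in direction]

    # Vector from the reference point on the line to the data point.
    offset = [p - o for p, o in zip(point, origin)]

    # Signed coordinate of the projection along the line.
    t = sum(a * u for a, u in zip(offset, unit))

    # Component of the offset perpendicular to the line.
    residual = [a - t * u for a, u in zip(offset, unit)]

    orthogonal = math.sqrt(sum(r * r for r in residual))
    tangent = abs(t)
    return orthogonal, tangent


# Example: a point in 3-D and a line through the origin along the x-axis.
orth, tang = line_manifold_distances(
    [3.0, 4.0, 0.0], [0.0, 0.0, 0.0], [2.0, 0.0, 0.0]
)
```

Under these assumed definitions, a point can lie close to the line in the orthogonal sense while being far from the cluster along the line, which is one plausible reason for considering the two distances together.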