As data scale keeps growing, Sparse Subspace Clustering (SSC) faces a severe computational challenge. Existing SSC solvers such as the Alternating Direction Method of Multipliers (ADMM) are usually implemented sequentially and therefore cannot exploit multi-core processors to speed up large-scale clustering. To address this issue, a parallel SSC method based on coordinate descent is proposed. The method builds on the observation that SSC can be formulated as a series of per-sample sparse self-expression sub-problems. Each sub-problem is solved by coordinate descent, which has few parameters and converges quickly. Because the self-expression sub-problems are mutually independent, the sub-problems of different samples are solved simultaneously on different processor cores. This makes full use of the available computing resources and reduces running time, which makes the algorithm suitable for large-scale clustering. Experiments on simulated data and on the Hopkins-155 motion segmentation dataset, with the commonly used ADMM algorithm as the baseline, show that the proposed algorithm runs significantly faster on multi-core processors while achieving clustering accuracy comparable to that of ADMM.
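The two ideas in the abstract, coordinate descent on each self-expression sub-problem and solving the sub-problems concurrently, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the common lasso form of the sub-problem, min_c 0.5*||x_i - X c||^2 + lam*||c||_1 with c_i = 0, which the abstract does not spell out. The function names, the parameter `lam`, and the fixed number of sweeps are all hypothetical. A thread pool is used only to keep the example portable; for CPU-bound work on several cores, a process pool would be the natural substitute.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def soft_threshold(value, threshold):
    """Scalar soft-thresholding operator for the l1 penalty."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)


def self_expression_cd(X, i, lam=0.1, n_sweeps=50):
    """Cyclic coordinate descent for one sample's sub-problem (assumed lasso form).

    X stores one sample per column; the returned vector c expresses
    column i in terms of the other columns, with c[i] fixed at 0.
    """
    n = X.shape[1]
    c = np.zeros(n)
    residual = X[:, i].copy()            # equals x_i - X @ c, since c starts at 0
    col_sq_norms = np.sum(X * X, axis=0)
    for _ in range(n_sweeps):            # fixed sweep count; no convergence test
        for j in range(n):
            if j == i or col_sq_norms[j] == 0.0:
                continue                 # a sample may not represent itself
            # Correlation of column j with the residual that excludes c[j].
            rho = X[:, j] @ residual + col_sq_norms[j] * c[j]
            new_cj = soft_threshold(rho, lam) / col_sq_norms[j]
            residual += X[:, j] * (c[j] - new_cj)   # keep residual in sync
            c[j] = new_cj
    return c


def parallel_self_expression(X, lam=0.1, n_sweeps=50, max_workers=4):
    """Solve all per-sample sub-problems concurrently; column i of the result is c_i."""
    n = X.shape[1]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        columns = list(
            pool.map(lambda i: self_expression_cd(X, i, lam, n_sweeps), range(n))
        )
    return np.column_stack(columns)
```

The sub-problems share only the read-only matrix `X`, so no synchronization between workers is needed. In a full SSC pipeline the coefficient matrix would then typically be symmetrized and passed to spectral clustering, a step this sketch omits.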