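Semi-supervised clustering uses the auxiliary information carried by a small number of labeled samples to guide the partitioning of a large amount of unlabeled data. The semi-supervised FCM (sFCM) algorithm proposed by Pedrycz uses the class-membership information of labeled samples to assist clustering, but when labeled points are too sparse it degenerates into the unsupervised FCM algorithm and converges slowly, which makes it hard to apply to most practical problems. Building on semi-supervised FCM, this paper proposes a degeneracy-improved semi-supervised FCM algorithm (dsFCM), which sets the weight of the supervised component during the sFCM iterations to increase the influence of labeled sample points on the cluster centers. Compared with semi-supervised FCM, it improves clustering accuracy, speed, and robustness, resolves the degeneracy problem that arises when labeled points are sparse, and has been applied successfully to medical image segmentation.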
Traditional clustering algorithms are usually treated as unsupervised methods that group unlabeled data to extract information of interest, whereas semi-supervised clustering uses a limited amount of labeled data to guide the grouping of a large body of unlabeled data. Pedrycz proposed a semi-supervised Fuzzy C-Means algorithm (sFCM) that incorporates the supervised information from labeled data as an additive term in the objective function of the Fuzzy C-Means algorithm (FCM). This paper proposes a novel algorithm, the Degeneracy-Improved Semi-Supervised Fuzzy C-Means algorithm (dsFCM), to fundamentally overcome the critical disadvantages of Pedrycz's sFCM algorithm, i.e., degeneracy to the classical FCM algorithm and slow convergence, particularly on real data sets in which labeled points are far fewer than unlabeled points. Experimental results on UCI benchmark data and IBSR brain MR image data demonstrate that dsFCM outperforms sFCM in accuracy, speed, and robustness. Moreover, they show that dsFCM avoids slow convergence and degeneracy to classical FCM when clustering real-world data with very few labeled points, and that it is effective for interactive segmentation of medical images in which the user supplies a small number of labeled data points.
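As an illustration of how supervised information can enter FCM as an additive term, the following Python sketch implements one common form of semi-supervised FCM with fuzzifier m = 2. It assumes the objective J = sum_i sum_k u_ik^2 d_ik^2 + alpha * sum_i sum_k (u_ik - f_ik b_k)^2 d_ik^2, where f_ik is the given membership of labeled point k in cluster i and b_k is 1 for labeled points and 0 otherwise. The function name semi_supervised_fcm, its parameters, and the fixed weight alpha are assumptions made for this example. The abstract does not state how dsFCM sets the weight of the supervised component, so this sketch should not be read as the dsFCM update itself.

```python
import numpy as np

def semi_supervised_fcm(X, F, b, c, alpha=1.0, n_iter=100, tol=1e-6, seed=0):
    """Sketch of a semi-supervised FCM with fuzzifier m = 2 (assumed objective).

    X     : (n, d) data matrix
    F     : (n, c) label memberships; rows of unlabeled points may be zero
    b     : (n,)   1.0 for labeled points, 0.0 for unlabeled points
    alpha : weight of the supervised term (hypothetical fixed parameter)
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial partition matrix whose rows sum to 1.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    FB = F * b[:, None]  # supervised memberships, zero for unlabeled points

    for _ in range(n_iter):
        # Centers: weighted means, with weights taken from both objective terms.
        W = U**2 + alpha * (U - FB)**2
        V = (W.T @ X) / W.sum(axis=0)[:, None]

        # Squared Euclidean distances from every point to every center.
        D2 = ((X[:, None, :] - V[None, :, :])**2).sum(axis=2) + 1e-12

        # Plain FCM memberships for m = 2.
        inv = 1.0 / D2
        fcm_part = inv / inv.sum(axis=1, keepdims=True)

        # Lagrangian solution of the assumed objective under sum_i u_ik = 1.
        scale = 1.0 + alpha * (1.0 - FB.sum(axis=1, keepdims=True))
        U_new = (scale * fcm_part + alpha * FB) / (1.0 + alpha)

        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, V
```

With alpha = 0 the update reduces to plain FCM, and with only a handful of labeled points the supervised term touches very few rows of U. This is the setting in which the abstract reports that sFCM degenerates to classical FCM.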