This paper proposes the CF-WFCM algorithm, which consists of two parts: a feature-weight learning algorithm and a clustering algorithm. Starting from the similarity inherent in the data, the feature-weight learning algorithm assigns a weight to each feature by minimizing the feature evaluation function CFuzziness(ω) with gradient descent. Applying the learned feature weights to the Fuzzy C-Means (FCM) clustering algorithm yields the clustering part of CF-WFCM. CF-WFCM strengthens the influence of important features on the clustering process and reduces the influence of redundant features, thereby improving clustering performance. Experiments on selected UCI data sets show that CF-WFCM produces better clustering results than FCM. In addition, CFuzziness(ω) can be used not only to evaluate feature importance and learn the feature weights, but also as a valid entropy function for judging the quality of feature evaluation indexes. Choosing a better evaluation index for learning the feature weights before clustering avoids a large amount of computation, as an example illustrates. Finally, the CF-WFCM algorithm is discussed.
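The clustering half of the method, FCM run with per-feature weights, can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not define CFuzziness(ω), so the weight-learning step is omitted and the weight vector `w` is simply passed in. The weighted distance form (each squared feature difference scaled by `w_k ** 2`) and the names `weighted_sq_dist` and `weighted_fcm` are assumptions made for illustration.

```python
import random


def weighted_sq_dist(x, v, w):
    """Squared Euclidean distance with per-feature weights (assumed form)."""
    return sum((wk ** 2) * (xk - vk) ** 2 for xk, vk, wk in zip(x, v, w))


def weighted_fcm(data, w, c=2, m=2.0, iters=100, tol=1e-6, seed=0):
    """Fuzzy C-Means on `data` using feature weights `w` in the distance.

    Returns (centers, u), where u[i][j] is the membership of point i
    in cluster j. The weights are assumed to be learned beforehand.
    """
    rng = random.Random(seed)
    n, d = len(data), len(data[0])

    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        s = sum(row)
        u.append([r / s for r in row])

    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by u_ij ** m.
        centers = []
        for j in range(c):
            um = [u[i][j] ** m for i in range(n)]
            total = sum(um)
            centers.append(
                [sum(um[i] * data[i][k] for i in range(n)) / total
                 for k in range(d)]
            )

        # Membership update from the weighted squared distances.
        new_u = []
        for i in range(n):
            dist = [weighted_sq_dist(data[i], centers[j], w) for j in range(c)]
            if min(dist) == 0.0:
                # The point coincides with one or more centers.
                zeros = sum(1 for dd in dist if dd == 0.0)
                row = [1.0 / zeros if dd == 0.0 else 0.0 for dd in dist]
            else:
                inv = [dd ** (-1.0 / (m - 1.0)) for dd in dist]
                s = sum(inv)
                row = [val / s for val in inv]
            new_u.append(row)

        # Stop once the memberships no longer change noticeably.
        change = max(
            abs(new_u[i][j] - u[i][j]) for i in range(n) for j in range(c)
        )
        u = new_u
        if change < tol:
            break

    return centers, u
```

Calling `weighted_fcm(data, [1.0, 0.1], c=2)` on two-feature data, for example, lets the first feature dominate the distance while the second contributes little. This mirrors the idea of strengthening important features and damping redundant ones. Setting every weight to 1 reduces the sketch to ordinary FCM.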