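Fuzzy Clustering Model and Algorithm Based on Rate-Distortion Theory
  • Journal: 情报学报 (Journal of the China Society for Scientific and Technical Information), 2011, 30(8): 812-818.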
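  • Classification: O224 [Science: Operations Research and Cybernetics; Science: Mathematics]
  • Author affiliation: [1] Institute of Systems Engineering, Dalian University of Technology, Dalian 116024
  • Funding: National Natural Science Foundation of China (70871015); National High Technology Research and Development Program of China (863 Program) (2008AA04Z107)
  • Related project: Research on clustering models and algorithms in time series data mining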
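Abstract (translated from the Chinese):

This paper considers the clustering problem from an information-theoretic perspective, treating clustering as a process of lossy information compression. First, rate-distortion theory is used to build an optimization model for fuzzy clustering; compared with the classical fuzzy clustering model, the objective function contains an additional term that describes the complexity of the clustering process. A new cluster validity index is also proposed to estimate the number of clusters. Second, a fuzzy clustering algorithm based on rate-distortion theory is obtained by solving the optimization model. Finally, this algorithm is compared with the classical fuzzy C-means algorithm in numerical experiments. The results show that the rate-distortion-based algorithm can determine the number of clusters automatically, runs somewhat faster than fuzzy C-means, and yields a final fuzzy partition matrix that is less fuzzy than that of fuzzy C-means, so the clustering results are more definite and reliable.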

English abstract:

In this paper, clustering is treated as a process of lossy compression from an information-theoretic perspective. First, an optimization model of fuzzy clustering is built using rate-distortion theory. Compared with the classical fuzzy clustering model, the new model introduces an additional term in the objective function that describes the complexity of the clustering process. To estimate the number of clusters, a new cluster validity index is also proposed. Then a fuzzy clustering algorithm based on rate-distortion theory is obtained by solving the optimization model. Finally, numerical experiments compare the proposed algorithm with fuzzy c-means. The experimental results indicate that the proposed algorithm can estimate the number of clusters automatically and runs in less time than fuzzy c-means. Moreover, its membership assignments are less ambiguous than those of fuzzy c-means, which makes the results more definite and reliable.
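The abstract does not give the model's equations, so the following is only an illustrative sketch of how a rate-distortion view of soft clustering is commonly iterated, not the paper's algorithm. It assumes a squared Euclidean distortion, a fixed trade-off parameter `beta`, and a Blahut-Arimoto-style alternating update; the function names `rd_soft_clustering` and `partition_entropy` are hypothetical. Partition entropy is used here as one common measure of how fuzzy a membership matrix is and may differ from the validity index proposed in the paper.

```python
import numpy as np

def rd_soft_clustering(X, k, beta, n_iter=100, seed=0):
    """Illustrative Blahut-Arimoto-style soft clustering (assumed form)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Initialize centers from k distinct data points, with a uniform prior.
    centers = X[rng.choice(n, size=k, replace=False)].astype(float)
    prior = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # Squared Euclidean distortion between each point and center: (n, k).
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        # Memberships u_ij proportional to prior_j * exp(-beta * d_ij),
        # computed in the log domain for numerical stability.
        logits = np.log(prior + 1e-300) - beta * d
        logits -= logits.max(axis=1, keepdims=True)
        U = np.exp(logits)
        U /= U.sum(axis=1, keepdims=True)
        # Cluster prior is the average membership of each cluster.
        prior = U.mean(axis=0)
        # Centers are membership-weighted means of the data.
        weights = np.maximum(U.sum(axis=0), 1e-12)
        centers = (U.T @ X) / weights[:, None]
    return U, centers, prior

def partition_entropy(U):
    """Mean row entropy of the membership matrix; lower means less fuzzy."""
    return float(-(U * np.log(U + 1e-300)).sum(axis=1).mean())

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(0.0, 0.5, (30, 2)),
                   rng.normal(10.0, 0.5, (30, 2))])
    U, centers, prior = rd_soft_clustering(X, k=2, beta=1.0)
    print("cluster priors:", prior)
    print("partition entropy:", partition_entropy(U))
```

In this sketch, a larger `beta` puts more weight on distortion and produces crisper memberships, while a smaller `beta` favors a lower-complexity (more compressed) assignment. Clusters whose prior shrinks toward zero effectively drop out, which is one way such schemes can reduce the number of active clusters; whether the paper uses this mechanism is not stated in the abstract.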
