Multifractal-Based Cluster Hierarchy Optimization Algorithm

Journal: Journal of Software (软件学报), 2008, 19(6): 1283-1300
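Classification: TP181 [Automation and Computer Technology: Control Science and Engineering; Automation and Computer Technology: Control Theory and Control Engineering]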
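Author affiliations: [1] School of Computer Science, Northwestern Polytechnical University, Xi'an, Shaanxi 710072; [2] School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou, Gansu 730070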
Funding: Supported by the National Natural Science Foundation of China under Grant No. 60573096; the NSFC-JST Major International (Regional) Joint Research Project under Grant No. 60720106001; the Foundation of Gansu Province Educational Department of China under Grant No. 0604-09
Related project: Research on schema-based, highly scalable P2P data management technology

Keywords: data mining, clustering, multifractal, post-processing, optimization

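Chinese abstract (English translation):

Initial clustering results often exhibit similarities of varying strength among the clusters. These similarities make the results harder for users to understand and describe, which in turn hinders subsequent data mining work. Traditional clustering algorithms emphasize cluster shape and spatial adjacency, or assume a uniform global data distribution density, and therefore have difficulty solving this kind of problem in practice. To address this, the fractal-based cluster hierarchy optimization algorithm FCHO is proposed. Based on multifractal theory, FCHO measures the similarity between initial clusters using the multifractal dimension of each cluster and the degree to which the multifractal dimension changes after clusters are merged, and ultimately generates a cluster family tree that reflects the natural aggregation state of the data. In addition, the time and space complexity of the algorithm is given a preliminary analysis, and experiments on synthetic and benchmark data sets confirm the algorithm's effectiveness.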
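
As a rough illustration of the idea in the abstract above, the Python sketch below estimates a generalized (Rényi) fractal dimension D_q by box counting and uses the change in that dimension after merging two clusters as a dissimilarity score for greedy agglomeration. It is not the FCHO implementation from the paper: the box-counting estimator, the grid sizes in `radii`, the scoring formula in `merge_dissimilarity`, the greedy merge loop, and all function names are assumptions made for illustration only.

```python
import math
from collections import Counter


def generalized_dimension(points, q=2.0, radii=(0.5, 0.25, 0.125, 0.0625)):
    """Estimate D_q as the slope of log(sum_i p_i^q) / (q - 1) against log r.

    p_i is the fraction of points falling in grid cell i of side length r.
    The grid sizes in `radii` are an illustrative assumption.
    """
    if q == 1:
        raise ValueError("q = 1 requires the information-dimension limit")
    n = len(points)
    xs, ys = [], []
    for r in radii:
        # Count how many points fall into each grid cell of side r.
        cells = Counter(tuple(math.floor(c / r) for c in p) for p in points)
        moment = sum((count / n) ** q for count in cells.values())
        xs.append(math.log(r))
        ys.append(math.log(moment) / (q - 1))
    # Ordinary least-squares slope of ys against xs.
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def merge_dissimilarity(a, b, q=2.0):
    """Hypothetical score: how far the merged dimension moves from each part."""
    d_a = generalized_dimension(a, q)
    d_b = generalized_dimension(b, q)
    d_ab = generalized_dimension(a + b, q)
    return abs(d_ab - d_a) + abs(d_ab - d_b)


def build_hierarchy(clusters, q=2.0):
    """Greedily merge the pair of clusters with the smallest dissimilarity.

    Returns a nested tuple of the original cluster indices (the merge tree).
    """
    nodes = [(i, list(c)) for i, c in enumerate(clusters)]
    while len(nodes) > 1:
        _, i, j = min(
            (merge_dissimilarity(nodes[i][1], nodes[j][1], q), i, j)
            for i in range(len(nodes))
            for j in range(i + 1, len(nodes))
        )
        (tree_a, pts_a), (tree_b, pts_b) = nodes[i], nodes[j]
        nodes = [node for k, node in enumerate(nodes) if k not in (i, j)]
        nodes.append(((tree_a, tree_b), pts_a + pts_b))
    return nodes[0][0]
```

For instance, given two collinear line segments and one filled square, this sketch merges the two segments first: joining them leaves the estimated dimension near 1, whereas joining a segment with the square shifts it noticeably.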

English abstract:

A cluster is a collection of data objects that are similar to one another and dissimilar to the objects in other clusters. In real-life data sets, the many initial clusters produced by a clustering algorithm are themselves similar to one another to varying degrees. An analyst who knows nothing about these similarities may have difficulty carrying out further analysis, so it is valuable to analyze them and construct a hierarchy over the initial clusters. Traditional clustering methods are ill-suited to this cluster post-processing problem because they favor convex clusters, rely on impractical assumptions, and require multiple scans of the data set. Based on multifractal theory, this paper proposes the FCHO (fractal-based cluster hierarchy optimization) algorithm, which integrates cluster similarity with cluster shape and cluster distribution to construct a cluster hierarchy tree from the disjoint initial clusters. A preliminary analysis of the time and space complexity of FCHO is presented. Comparative experiments on synthetic and real-life data sets demonstrate the performance and effectiveness of FCHO.
