A clustering algorithm for high-dimensional categorical data based on set dissimilarity

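ISSN: 1001-053X
Journal: Journal of University of Science and Technology Beijing (北京科技大学学报)
Date: 0
Pages: 1085-1089
Language: Chinese
Classification: TP311 [Automation and Computer Technology: Computer Software and Theory; Automation and Computer Technology: Computer Science and Technology]
Author affiliation: [1] School of Economics and Management, University of Science and Technology Beijing, Beijing 100083
Funding: National Natural Science Foundation of China (No. 70771007)
Related project: Research on clustering of high-dimensional sparse data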

Keywords: clustering, high-dimensional space, sets, dissimilarity, data mining

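Abstract (translated from the Chinese):

A clustering algorithm based on set dissimilarity is proposed. Using the defined set dissimilarity and a compact set representation, the algorithm directly computes the overall degree of difference among all objects in a set, without computing the distance between each pair of objects, and it compresses high-dimensional categorical data heavily without affecting computational accuracy. A single scan of the data yields the clustering result. The time complexity of the algorithm is close to linear. An example shows that the algorithm is effective.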

English abstract:

A clustering algorithm based on set dissimilarity is proposed. By defining set dissimilarity and set reduction, the algorithm does not calculate the distance between each pair of objects but directly computes the overall dissimilarity of all the objects in a set. It compresses high-dimensional categorical data substantially without loss of computational accuracy and obtains the clustering result in a single scan of the data. The time complexity of the algorithm is almost linear. An example on real data shows that the clustering algorithm is effective.
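
The abstract does not give the formulas, so the following is only a minimal sketch of the general idea under stated assumptions, not the paper's definitions. It assumes a hypothetical dissimilarity measure (the fraction of attributes on which the members of a set do not all share one value) and a hypothetical compact summary (set size plus, per attribute, either the shared value or a "mixed" marker). Under that assumed measure the summary is lossless: the dissimilarity of a set after adding an object can be computed from the summary alone, without revisiting members or computing pairwise distances. The names `SetSummary`, `cluster_one_pass`, and `threshold` are illustrative.

```python
from dataclasses import dataclass

MIXED = object()  # sentinel: members of the set disagree on this attribute


@dataclass
class SetSummary:
    """Compact summary of a set of categorical objects (hypothetical layout)."""
    size: int
    values: list  # per attribute: the value shared by all members, or MIXED

    @classmethod
    def of(cls, obj):
        return cls(1, list(obj))

    def merged_with(self, obj):
        # An attribute stays "shared" only if the new object matches it.
        merged = [v if v is not MIXED and v == x else MIXED
                  for v, x in zip(self.values, obj)]
        return SetSummary(self.size + 1, merged)

    def dissimilarity(self):
        # Assumed measure: fraction of attributes on which members disagree.
        return sum(v is MIXED for v in self.values) / len(self.values)


def cluster_one_pass(objects, threshold):
    """Assign each object, in one scan, to the set whose dissimilarity after
    the merge is lowest and within the threshold; otherwise open a new set."""
    summaries, labels = [], []
    for obj in objects:
        best, best_summary = None, None
        for i, s in enumerate(summaries):
            cand = s.merged_with(obj)
            d = cand.dissimilarity()
            if d <= threshold and (best_summary is None
                                   or d < best_summary.dissimilarity()):
                best, best_summary = i, cand
        if best is None:
            summaries.append(SetSummary.of(obj))
            labels.append(len(summaries) - 1)
        else:
            summaries[best] = best_summary
            labels.append(best)
    return labels, summaries


data = [("a", "x", "1"), ("a", "x", "2"), ("b", "y", "3"),
        ("b", "y", "3"), ("a", "x", "1")]
labels, summaries = cluster_one_pass(data, threshold=0.34)
print(labels)
```

Each object in this sketch is compared against the current set summaries only, so the work per object is proportional to the number of sets times the number of attributes. That is in the spirit of the single-scan, near-linear behaviour the abstract describes, although the paper's own measure and update rule may differ.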
