An Initial-Center Method for the K-means Algorithm Based on Density and Minimum Distance

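ISSN: 1673-629X
Journal: Computer Technology and Development (《计算机技术与发展》)
Date: 0
Classification: TP301.6 [Automation and Computer Technology: Computer System Architecture; Automation and Computer Technology: Computer Science and Technology]
Author affiliation: School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210003, Jiangsu, China
Funding: National Natural Science Foundation of China (61302157)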

Authors: Qi Houlin (戚后林), Gu Lei (顾磊)

Keywords: K-means algorithm, cluster center, density, minimum distance, iteration number

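Chinese abstract (translated):

The traditional K-means clustering algorithm assigns its initial cluster centers at random, which makes the clustering results fluctuate widely. To overcome this defect, an algorithm is proposed that determines the initial cluster centers using density and minimum distance as parameters. The algorithm computes a density parameter for each data object according to a fixed rule. Once the single-point density of every record in the data set is known, it computes, for each data object, the minimum distance to the other data objects whose density is higher. With density and minimum distance as parameters, the points for which both values are large are selected as the initial cluster centers for K-means clustering. Experimental results show that, when the number of clusters is fixed, the initial centers determined by this algorithm support K-means clustering on standard UCI data sets, and that, compared with random selection of cluster centers and with other algorithms that use density as a parameter, K-means clustering with the improved initial-center method achieves higher accuracy and faster convergence.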

English abstract:

To overcome the large fluctuation in results caused by the random assignment of initial cluster centers in traditional K-means clustering, an algorithm is proposed that takes density and minimum distance as the parameters for determining the initial cluster centers. After calculating the single-point density of every record in the data set according to certain rules, it calculates the minimum distance between each data object and the other data objects of higher density. Points whose density and minimum distance are both large are chosen as the initial cluster centers for K-means clustering. Experimental results on standard UCI data sets show higher accuracy and a faster convergence rate than K-means clustering with randomly selected cluster centers or with other methods that use density as a parameter.
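
The selection procedure outlined in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how density is computed or how the two parameters are combined, so the sketch assumes density is the number of neighbours within a hypothetical `cutoff` radius and ranks points by the product of density and minimum distance. The function name `initial_centers` and the sample data are also made up for the example.

```python
import math


def initial_centers(points, k, cutoff):
    """Pick k initial K-means centers from density and minimum distance.

    Assumptions (not taken from the abstract): density is the number of
    other points closer than `cutoff`, and candidates are ranked by the
    product density * delta.
    """
    n = len(points)
    # Pairwise Euclidean distances.
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Density of each point: how many other points lie within the cutoff.
    density = [
        sum(1 for j in range(n) if j != i and dist[i][j] < cutoff)
        for i in range(n)
    ]

    def denser(j, i):
        # j counts as denser than i; equal densities are broken by index
        # so that exactly one point has no denser neighbour.
        return density[j] > density[i] or (density[j] == density[i] and j < i)

    # Delta: minimum distance to any denser point. The single densest
    # point has none, so it receives its largest distance to any point.
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if denser(j, i)]
        delta.append(min(higher) if higher else max(dist[i]))

    # Only points that are large in both values get a high product.
    order = sorted(range(n), key=lambda i: density[i] * delta[i], reverse=True)
    return [points[i] for i in order[:k]]


# Two well-separated square clusters, each with a point at its middle.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
    (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5),
]
centers = initial_centers(points, k=2, cutoff=1.2)
print(centers)
```

On this toy data the two middle points are selected, one per cluster, because each is dense and far from any denser point. The returned points would then seed an ordinary K-means run in place of random initial centers.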
