Research on a Distributed K-means Clustering Algorithm Based on Node Data Density

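ISSN: 1001-3695
Journal: Application Research of Computers (计算机应用研究)
Year: 2011
Pages: 3643-3645+3655
Classification: TP393 [Automation and Computer Technology: Computer Application Technology; Automation and Computer Technology: Computer Science and Technology]
Author affiliation: [1] School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, Jiangsu 212013
Funding: National Natural Science Foundation of China (61005017); National Science and Technology Innovation Fund (10C26213200946); Natural Science Foundation of Jiangsu Province (BK2009199); Natural Science Basic Research Program of Jiangsu Higher Education Institutions (10KJB520005); Jiangsu University Senior Talent Fund (1283000347); Jiangsu Province Science and Technology Innovation Program (BC2009265)
Related project: Research on Distributed Image Annotation Methods Based on P2P Networks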

Keywords: P2P (peer-to-peer), K-means clustering, self-adjustment, confidence radius

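Abstract (translated from Chinese):

Distributed clustering over a P2P (peer-to-peer) network uses the computing and storage capacity of each node, together with the network's bandwidth, to spread an algorithm's time and space complexity across the nodes. This makes it feasible to process and analyze massive distributed data sets and overcomes the data-processing limits of traditional centralized clustering algorithms that run on a single server. This paper proposes a distributed K-means clustering algorithm based on a node confidence radius. The algorithm computes the density of the data distribution on each node, identifies where data of the same class is densely or sparsely distributed on that node, and from this determines a clustering confidence radius that guides the next clustering step. Experiments show that the algorithm effectively reduces the number of iterations and saves network bandwidth, while producing clustering results close to those of a centralized clustering algorithm.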

English abstract:

Distributed clustering over a P2P (peer-to-peer) network spreads time and space complexity across the peers by using their computing and storage capacity as well as the bandwidth of the network. It overcomes the limitations of traditional centralized clustering algorithms in processing distributed data and makes it possible to process and analyze massive distributed data. This paper presents a distributed K-means clustering algorithm based on the confidence radius of the local peer. The algorithm calculates the data density on the local peer to find the dense and sparse regions of each cluster, and uses them to derive the confidence radius that guides the next clustering step. Experimental results show that the algorithm effectively reduces the number of iterations and saves network bandwidth. The clustering results are also close to those of the centralized clustering algorithm.
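
The abstract does not spell out how the confidence radius is computed, so the following is only a minimal sketch of the general idea: one local K-means step on a single peer, in which a per-cluster radius separates densely placed members from sparse ones. The function names (`assign`, `confidence_radius`, `local_step`), the `scale` parameter, and the rule "radius = scale times the median member-to-centroid distance" are all assumptions made for illustration. They are not the paper's definitions.

```python
import math
import statistics


def assign(points, centroids):
    """Group local points by the index of their nearest centroid."""
    clusters = {k: [] for k in range(len(centroids))}
    for p in points:
        k = min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        clusters[k].append(p)
    return clusters


def confidence_radius(distances, scale=2.0):
    """Hypothetical rule (not from the paper): scale * median distance."""
    return scale * statistics.median(distances) if distances else 0.0


def local_step(points, centroids, scale=2.0):
    """Run one local K-means step on a single peer.

    Returns, per cluster index, a summary (centroid, dense_count, radius)
    computed only from the members that lie inside the confidence radius.
    """
    summaries = {}
    for k, members in assign(points, centroids).items():
        if not members:
            # No local data for this cluster: keep the current centroid.
            summaries[k] = (tuple(centroids[k]), 0, 0.0)
            continue
        dists = [math.dist(p, centroids[k]) for p in members]
        radius = confidence_radius(dists, scale)
        dense = [p for p, d in zip(members, dists) if d <= radius]
        if not dense:
            # Can only happen for small scale values: fall back to all members.
            dense = members
        centroid = tuple(sum(axis) / len(dense) for axis in zip(*dense))
        summaries[k] = (centroid, len(dense), radius)
    return summaries


if __name__ == "__main__":
    local_points = [(0, 0), (0, 1), (1, 0), (1, 1), (4, 4),
                    (10, 10), (10, 11), (11, 10), (11, 11)]
    print(local_step(local_points, [(0, 0), (10, 10)]))
```

In this toy data set the stray point (4, 4) is assigned to the first cluster but falls outside its radius, so it does not pull the local centroid away from the dense group. In a P2P setting, a peer could exchange such compact per-cluster summaries with its neighbors instead of raw data. That exchange is likewise an assumption here, not a description of the paper's protocol.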
