位置:成果数据库 > 期刊 > 期刊详情页
GPU加速的改进PAM聚类算法研究与应用
  • ISSN号:1560-8999
  • 期刊名称:《地球信息科学学报》
  • 时间:0
  • 分类:TP391.41[自动化与计算机技术—计算机应用技术;自动化与计算机技术—计算机科学与技术]
  • 作者机构:[1]北京大学遥感与地理信息系统研究所,北京100871, [2]中国矿业大学(北京)资源与安全工程学院,北京100083
  • 相关基金:国家重点研发计划重点专项(2016YFC0801800)
中文摘要:

空间聚类是空间数据挖掘的重要方法,而K-Medoids是一种常用的空间聚类算法。K-Medoids聚类算法存在初始点选择问题,而且计算复杂。为了提高算法的有效性和时间效率,本文结合模拟退火算法思想,改进了传统的K-Medoids算法PAM,提出一种基于GPU计算的并行模拟退火PAM算法。类比矩阵乘法运算,定义了一种新的矩阵计算方法,可以有效减少数据在GPU全局内存和共享内存之间的传输,提高了算法在GPU中的执行效率。利用模拟退火算法搜索聚类中心点,保证了聚类结果的全局最优性。基于不同的数据集,将串行和并行模拟退火PAM算法以及已有的遗传PAM算法进行比较,结果表明并行模拟退火PAM算法聚类结果正确,且时间效率高。最后,应用本文改进算法对贵州省安监系统的安全监管隐患数据进行聚类分析,发现了隐患聚集中心,相关结果对政府的决策具有一定的实际应用价值。

英文摘要:

Spatial clustering is one of the most important methods in spatial data mining. As a common but powerful spatial clustering algorithm, K-Medoids is applied in many fields such as generalization of spatial entity information, spatial point pattern analysis and epidemiology application. However, K-Medoids algorithm meets two main challenges innately as follow. At first, K-Medoids has selection problem of the initial medoids. Different initial medoids may not attain the same clustering results which could lead to a non-optimal results sometimes. Furthermore, time efficiency of the algorithm is not satisfactory because there exist quantities of iterations to find the most suitable partition. Existing studies on the K-Medoids algorithm don't take the validness and time efficiency into consideration at the same time. Optimal methods like the Genetic Algorithm are applied to improve the validness of K-Medoids but the time efficiency is not acceptable when dealing with growing data. The MapReduce model is utilized to handle with data of high volume which can' t adapt to some circumstances short of computer clusters. In order to improve the result validity and time efficiency of the algorithm, this paper revised the traditional K-Medoids algorithm of Partitioning Around Medoids (PAM) combining with the idea of the Simulate Anneal Arithmetic (SAA) and proposed a parallel Simulate Anneal Partitioning Around Medoids (SAPAM) algorithm which was implemented efficiently in Graphics Processing Units (GPUs). SAA algorithm is used to search for the initial medoids which promises the validness of the algorithm. The stochastic factor introduced in SAA algorithm gives the possibility of eliminating the local optima to attain the global optimal clustering results of PAM. To accelerate the clustering process, we design the parallel SAPAM algorithm to utilize quantities of GPU' s threads which execute the program at the same time. By analogy with the matrix multiplication, a new matrix computation method is defined

同期刊论文项目
同项目期刊论文
期刊信息
  • 《地球信息科学学报》
  • 中国科技核心期刊
  • 主管单位:中国科学院
  • 主办单位:中国科学院地理科学与资源研究所 中国地理学会
  • 主编:徐冠华
  • 地址:北京大屯路甲11号
  • 邮编:100101
  • 邮箱:sxfu@lreis.ac.cn
  • 电话:010-64888891
  • 国际标准刊号:ISSN:1560-8999
  • 国内统一刊号:ISSN:11-5809/P
  • 邮发代号:82-919
  • 获奖情况:
  • 国内外数据库收录:
  • 中国中国科技核心期刊,中国北大核心期刊(2008版),中国北大核心期刊(2011版),中国北大核心期刊(2014版)
  • 被引量:3181