Frequent pattern mining is a core problem in data mining. Traditionally, parallel frequent pattern mining has been carried out on PC clusters and has rarely targeted shared-memory multiprocessor or many-core systems. In this paper, we propose a parallel frequent pattern mining algorithm for the GPU (graphics processing unit), based on breadth-first search and direct support counting, and implement it under the GPU's compute unified device architecture (CUDA). In the resulting algorithm, GPU-based FPMA, the CPU controls the search process, while the GPU's multiprocessors count supports over partitioned data, reading it in the sequential data-stream fashion that suits the GPU. In addition, the transaction data set is dynamically pruned according to the length (k) of the candidate itemsets. Experimental results show that GPU-based FPMA is on average more than 10 times faster than its CPU-based counterpart.
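The division of work described above can be illustrated with a minimal CPU-only sketch in Python. It is not the paper's CUDA implementation: the function names (`partition`, `count_partition`, `generate_candidates`, `mine`), the `n_parts` parameter, and the Apriori-style candidate join are assumptions made for illustration. Each data chunk merely stands in for the share of transactions that one GPU multiprocessor would scan.

```python
from itertools import combinations


def partition(transactions, n_parts):
    """Split the transaction list into n_parts roughly equal chunks."""
    return [transactions[i::n_parts] for i in range(n_parts)]


def count_partition(chunk, candidates):
    """Direct counting: for one chunk, count the transactions containing each candidate."""
    return [sum(1 for t in chunk if c <= t) for c in candidates]


def generate_candidates(frequent, k):
    """Join frequent (k-1)-itemsets into k-itemsets whose (k-1)-subsets are all frequent.

    Assumption: an Apriori-style join; the source does not specify this step.
    """
    prev = set(frequent)
    joined = {a | b for a in prev for b in prev if len(a | b) == k}
    return sorted(
        (c for c in joined
         if all(frozenset(s) in prev for s in combinations(c, k - 1))),
        key=sorted,
    )


def mine(transactions, min_support, n_parts=4):
    """Level-wise (breadth-first) mining; returns {itemset: support count}."""
    data = [frozenset(t) for t in transactions]
    candidates = sorted({frozenset([i]) for t in data for i in t}, key=sorted)
    result, k = {}, 1
    while candidates:
        # Dynamic pruning: a transaction with fewer than k items
        # cannot contain any candidate of length k.
        data = [t for t in data if len(t) >= k]
        # Data partitioning: every chunk is counted independently
        # (on a GPU, in parallel), then the partial counts are summed.
        partials = [count_partition(chunk, candidates)
                    for chunk in partition(data, n_parts)]
        totals = [sum(column) for column in zip(*partials)]
        frequent = [c for c, n in zip(candidates, totals) if n >= min_support]
        result.update((c, n) for c, n in zip(candidates, totals) if n >= min_support)
        # The host-side loop plays the role of the CPU controlling the search.
        k += 1
        candidates = generate_candidates(frequent, k) if frequent else []
    return result
```

Calling `mine([{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b"}], min_support=2)` returns the support counts of all itemsets that occur in at least two transactions. Because the per-chunk counts are simply summed, the result does not depend on how the data is partitioned, which is what makes the counting step easy to spread across multiprocessors.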