Frequent patterns play an essential role in many data mining tasks, yet most existing support-counting methods are inefficient and incur high I/O and computation costs. This paper proposes a method that uses a multi-level bitmap catalogue to compute itemset supports, and describes both the structure of the catalogue and the corresponding algorithms. The bitmaps are organized by a scalable, dynamic block-partitioning mechanism; on that basis the bitmaps are compressed by encoding, that is, each long vector block is replaced by a short code, which substantially reduces disk and main-memory requirements. Finally, the performance of the algorithm is analyzed on the basis of experiments. The support-counting algorithm based on the multi-level bitmap catalogue has a simple structure and low space and time overhead.
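To make the underlying idea concrete, here is a minimal sketch of bitmap-based support counting with a simple block encoding. It is not the paper's multi-level bitmap catalogue: the function names (`build_bitmaps`, `support`, `encode_blocks`), the fixed block size, and the toy transactions are all assumptions chosen for illustration, and the paper's dynamic block management is replaced here by plain fixed-size blocks.

```python
from functools import reduce


def build_bitmaps(transactions):
    """Map each item to an integer whose bit t is set if transaction t contains the item."""
    bitmaps = {}
    for tid, items in enumerate(transactions):
        for item in items:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    return bitmaps


def support(bitmaps, itemset):
    """Support of an itemset: AND the item bitmaps, then count the set bits."""
    vectors = [bitmaps.get(item, 0) for item in itemset]
    if not vectors:
        raise ValueError("itemset must be non-empty")
    combined = reduce(lambda a, b: a & b, vectors)
    return bin(combined).count("1")


def encode_blocks(bitmap, n_bits, block_size=4):
    """Split a bitmap into fixed-size blocks (an assumption; the paper uses
    dynamic blocks) and replace each distinct block with a short integer code."""
    codebook = {}  # block value -> short code
    codes = []
    mask = (1 << block_size) - 1
    for start in range(0, n_bits, block_size):
        block = (bitmap >> start) & mask
        codes.append(codebook.setdefault(block, len(codebook)))
    return codes, codebook


# Hypothetical toy data: four transactions over items a-d.
transactions = [{"a", "b", "c"}, {"a", "c"}, {"b", "d"}, {"a", "b", "c", "d"}]
bitmaps = build_bitmaps(transactions)
```

With this toy data, `support(bitmaps, ["a", "c"])` counts the transactions that contain both items by intersecting two bit vectors, so no rescan of the transaction list is needed. `encode_blocks` shows, in the simplest form, how repeated blocks can share one short code.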