相对于频繁项集,最大频繁项集的数目较少,挖掘最大频繁项集的算法具有较高的时空效率。提出了一种新的基于文法顺序FP-Tree的最大频繁项集单遍挖掘算法FPMFI-DS。该算法采用了一种混合搜索空间项顺序策略,并利用我们所提出的一种新的剪枝技术—“子集等价剪枝技术”,有效缩小搜索空间的大小。基于该算法,提出了一种能够在线更新挖掘数据流滑动窗口中最大频繁项集的算法FPMFI-DS+。FPMFI-DS+算法能够在任意时刻都维护数据流当前窗口中的最大频繁项集。仿真实验表明,FPMFI-DS算法的效率接近于多遍挖掘算法FPMax^*,并具有良好的可扩展性,FPMFI-DS+算法更新挖掘速度快。
For the number of maximal frequent itemsets (MFIs) is less than that of frequent itemsets, the efficiency of algorithm for mining MFIs is higher. A novel single-pass lexicographical-order FP-Tree based algorithm, FPMF1-DS was proposed. FPMFI-DS uses a kind of mixed item. ordering policy and imports a new pruning technique, subset equivalence pruning technique. These two techniques effectively decrease the size of searching space. Based on FPMFI-DS, another algorithm, FPMFI-DS+ was proposed, which'could mine MFls in sliding window over data streams in an online updating fashion. FPMFI-DS+ can maintain MFIs in current sliding window over data streams at any time. The experiments show that FPMFI-DS is comparable with multi-pass algorithm FPMax^* regarding with the efficiency, and has good scalability, and FPMFI-DS+ has high updating-miningspeed.