在线挖掘滑动窗口中的频繁闭项集是一类重要的数据流挖掘问题。提出了一种新的频繁闭项集挖掘算法FPCFI-DS。该算法能够在有限的存储空间中高速挖掘数据流滑动窗口中的频繁闭项集,并且能够在任意时刻维护当前窗口中精确的频繁闭项集。对于第一个窗口中的数据,FPCFI-DS算法采用单遍过程FPCFI进行挖掘,挖掘结果被保存于一棵全局闭项集树GCT中。当窗口向前滑动时,FPCFI-DS算法采用更新挖掘方式快速挖掘出当前窗口中的频繁闭项集。实验结果表明,FPCFI-DS算法的空间效率和时间效率都显著优于同类经典算法Moment。
Online mining closed frequent itemsets in sliding window is one of the most important issues for mining data streams. A novel algorithm, FPCFI DS, is proposed, which can efficiently mine closed frequent itemsets over a stream sliding window with limited memory space, and maintain exact closed frequent itemsets in current window at any time. For data in the first window, the algorithm FPCFI-DS mines closed frequent itemsets using single-pass procedure, denoted as FPCFI. The resulting closed frequent itemsets are stored in a global closed frequent itemsets tree (GCT). When the window slides forward, the FPCFI DS quickly updates closed frequent itemsets in current window using the updating-mining method. The experimental results show that FPCFI-DS is superior to that of state-of-the-art algorithm Moment in terms of time and space efficiency.