Because the supports of the patterns in a sliding time window over a data stream change dynamically as new data arrive, it is difficult to choose an appropriate support threshold for mining the frequent patterns in the window. Based on a study of how stream data change within the sliding time window, this paper proposes a method for mining the top-k frequent patterns in the window. With a guarantee on the mining error, the method quickly prunes information about infrequent patterns, retains the significant pattern information, and outputs the top-k frequent patterns in descending order of support. Simulation results show that the algorithm is efficient and correct, and that it outperforms other comparable algorithms.
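To make the problem setting concrete, the following is a minimal Python sketch of top-k pattern mining over a sliding time window. It is a naive exact baseline and not the method proposed in the paper: the abstract does not describe the paper's data structure, pruning rule, or error bound, so none of them are reproduced here. The class name `SlidingWindowTopK`, the parameters `window_span` and `max_pattern_len`, and the choice of itemsets as patterns are all assumptions made for illustration.

```python
from collections import Counter, deque
from itertools import combinations


class SlidingWindowTopK:
    """Exact top-k itemset counting over a sliding time window.

    Hypothetical baseline for illustration only; it keeps every pattern
    in the window and therefore does no approximate pruning.
    """

    def __init__(self, window_span, max_pattern_len=2):
        self.window_span = window_span          # window length in time units
        self.max_pattern_len = max_pattern_len  # cap on itemset size (assumption)
        self.transactions = deque()             # (timestamp, patterns), arrival order
        self.support = Counter()                # pattern -> support in current window

    def _patterns(self, items):
        # Enumerate all itemsets of size 1..max_pattern_len as sorted tuples.
        items = sorted(set(items))
        for size in range(1, self.max_pattern_len + 1):
            yield from combinations(items, size)

    def add(self, timestamp, items):
        # Timestamps are assumed to arrive in non-decreasing order.
        patterns = list(self._patterns(items))
        self.transactions.append((timestamp, patterns))
        self.support.update(patterns)
        self._expire(timestamp)

    def _expire(self, now):
        # The window covers (now - window_span, now]; drop older transactions.
        while self.transactions and self.transactions[0][0] <= now - self.window_span:
            _, old_patterns = self.transactions.popleft()
            self.support.subtract(old_patterns)
            for pattern in old_patterns:
                if self.support[pattern] <= 0:
                    del self.support[pattern]

    def top_k(self, k):
        # Descending support; ties are broken by pattern for determinism.
        ranked = sorted(self.support.items(), key=lambda kv: (-kv[1], kv[0]))
        return ranked[:k]
```

Calling `add(timestamp, items)` for each arriving transaction and then `top_k(k)` returns the k patterns with the highest support in the current window, in descending order of support, without any support threshold. The cost of this baseline grows with the number of distinct patterns in the window, which is the kind of overhead that pruning infrequent patterns is meant to reduce.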