As data mining becomes widely used, many practical applications need to predict the future state of data from past and current data. To address this need, this paper proposes MFP, a matrix-based algorithm for predicting frequent patterns over data streams. MFP predicts the itemsets that are likely to be frequent in the next time window, so that users' needs can be met. The algorithm first converts the data into a 0-1 matrix. It then updates the matrix through matrix pruning and bit operations and mines frequent itemsets from it. Finally, it uses the data in the current window to predict the frequent itemsets that may appear in the next time window. Experimental results show that MFP predicts frequent itemsets effectively under different experimental settings, which indicates that the algorithm is feasible.
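The 0-1 matrix and bit-operation steps can be illustrated with a minimal sketch. This is not the MFP algorithm itself: the abstract does not specify the matrix layout, the pruning rule, or the prediction step, so the column-per-item encoding, the function names (`build_bit_columns`, `support`, `frequent_itemsets`), the `min_count` threshold, and the brute-force candidate enumeration are all assumptions made for illustration. The prediction step is deliberately left out.

```python
from itertools import combinations


def build_bit_columns(transactions, items):
    """Encode one window as a 0-1 matrix stored column-wise.

    Each item gets one Python int; bit t of that int is 1 when
    transaction t contains the item (assumed layout, not from the paper).
    """
    columns = {item: 0 for item in items}
    for t, transaction in enumerate(transactions):
        for item in transaction:
            if item in columns:
                columns[item] |= 1 << t
    return columns


def support(columns, itemset, n_transactions):
    """Support count = number of 1 bits in the AND of the item columns."""
    mask = (1 << n_transactions) - 1  # start with every transaction selected
    for item in itemset:
        mask &= columns[item]
    return bin(mask).count("1")


def frequent_itemsets(transactions, min_count, max_size=3):
    """Mine itemsets with support >= min_count (hypothetical helper)."""
    items = sorted({item for t in transactions for item in t})
    columns = build_bit_columns(transactions, items)
    n = len(transactions)
    # Simple stand-in for matrix pruning: drop columns of infrequent items.
    kept = [i for i in items if support(columns, (i,), n) >= min_count]
    result = {}
    # Brute-force enumeration for clarity; a real miner would generate
    # candidates level by level from the previous level's frequent sets.
    for size in range(1, max_size + 1):
        for itemset in combinations(kept, size):
            count = support(columns, itemset, n)
            if count >= min_count:
                result[itemset] = count
    return result


# Toy window of four transactions (made-up data).
window = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "d"}]
found = frequent_itemsets(window, min_count=2)
print(found)
```

Storing each item column as a single integer means the support of an itemset reduces to one bitwise AND per item followed by a bit count, which is the kind of saving a bit-operation approach aims for.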