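In frequent pattern mining (FPM) research, to mine frequent subtree patterns efficiently from massive data streams, a stream-based algorithm for mining frequent closed labeled subtrees (SFCLTreeMiner) is proposed, designed around the characteristics of data streams and subtree patterns. The algorithm is the first to study the mining of frequent closed labeled subtrees in dynamic data streams. It gives a batch mining method in which the set of closed labeled subtrees in the stream is maintained through additions and deletions, and it combines this with a time decay model to keep the results timely. Experimental results show that the algorithm considerably outperforms similar algorithms in mining performance, such as mining time and memory usage.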
Beyond the classic frequent pattern mining (FPM) algorithms, dynamic FPM algorithms for fast, massive data streams have become a prominent research topic. A new batch mining algorithm for data streams, called stream frequent closed labeled tree miner (SFCLTreeMiner), is proposed. SFCLTreeMiner maintains its results with an add-and-remove method over closed tree sets. It also provides a time decay module so that data updates stay timely. Experiments show that SFCLTreeMiner mines data streams efficiently, substantially reducing mining time and memory consumption.
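
The combination of batch updates and time decay can be illustrated with a minimal Python sketch. It is not the SFCLTreeMiner implementation: the class `DecayedSupportCounter`, its parameters `decay`, `min_support`, and `prune_below`, and the string encoding of subtrees are all hypothetical, and the closed-subtree enumeration itself is assumed to happen elsewhere. The sketch only shows how pattern supports might be aged once per batch, how new patterns are added, and how stale ones are removed.

```python
class DecayedSupportCounter:
    """Illustrative time-decayed support counting over batches (assumed design)."""

    def __init__(self, decay=0.5, min_support=1.0, prune_below=0.1):
        if not 0.0 < decay <= 1.0:
            raise ValueError("decay must be in (0, 1]")
        self.decay = decay              # factor applied to old supports per batch
        self.min_support = min_support  # threshold for reporting a pattern
        self.prune_below = prune_below  # supports under this value are dropped
        self.support = {}               # pattern -> decayed support

    def add_batch(self, batch):
        """Process one batch; each transaction is an iterable of hashable patterns."""
        # Age every known pattern once per batch.
        for pattern in self.support:
            self.support[pattern] *= self.decay
        # Add the patterns seen in this batch, counted once per transaction.
        for transaction in batch:
            for pattern in set(transaction):
                self.support[pattern] = self.support.get(pattern, 0.0) + 1.0
        # Remove patterns whose decayed support has become negligible.
        self.support = {
            p: s for p, s in self.support.items() if s >= self.prune_below
        }

    def frequent(self):
        """Return the patterns whose decayed support meets the threshold."""
        return {p: s for p, s in self.support.items() if s >= self.min_support}


counter = DecayedSupportCounter(decay=0.5, min_support=1.5)
counter.add_batch([["A(B)", "A(C)"], ["A(B)"]])
print(counter.frequent())  # {'A(B)': 2.0}
counter.add_batch([["A(C)"]])
print(counter.frequent())  # {'A(C)': 1.5}
```

In the second batch the older pattern `A(B)` decays from 2.0 to 1.0 and drops out of the frequent set, while `A(C)` rises to 1.5, which is the kind of recency-weighted result a time decay model is meant to produce.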