Mining maximal frequent patterns is a key problem in many data mining applications. Most previous approaches either adopt an Apriori-like candidate generation-and-test strategy or build on the FP-Tree. Both are costly: the former must generate a large number of candidates, and the latter must dynamically create a large number of conditional pattern trees. This paper therefore proposes MFP, a new algorithm for mining maximal frequent patterns based on a prefix tree. The prefix tree stores the data in a highly compact form, and MFP mines it directly in depth-first order by adjusting node information and node links. Because it neither generates candidates nor creates conditional pattern trees, the algorithm improves mining efficiency.
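To make the ideas in the abstract concrete, the following is a minimal sketch of depth-first maximal-pattern mining over a prefix tree with node links. It is not the MFP algorithm itself, whose in-place adjustment of node information is not detailed here. Instead, this sketch recomputes each support by following the node links of the last item and walking up to the root, which is simpler but slower. All names (`Node`, `build_prefix_tree`, `support`, `mine_maximal`) are hypothetical, and `min_support` is assumed to be an absolute transaction count.

```python
from collections import defaultdict


class Node:
    """One prefix-tree node: an item, its parent, and a transaction count."""

    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


def build_prefix_tree(transactions, min_support):
    """Compress transactions into a prefix tree (illustrative helper)."""
    freq = defaultdict(int)
    for t in transactions:
        for item in set(t):
            freq[item] += 1
    # Rank frequent items by descending frequency, ties broken by item value.
    frequent = sorted((i for i in freq if freq[i] >= min_support),
                      key=lambda i: (-freq[i], i))
    order = {item: rank for rank, item in enumerate(frequent)}
    root = Node(None, None)
    links = defaultdict(list)  # node links: item -> all nodes holding that item
    for t in transactions:
        node = root
        for item in sorted((i for i in set(t) if i in order), key=order.get):
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                links[item].append(child)
            child.count += 1
            node = child
    return links, order


def support(itemset, links, order):
    """Support of a non-empty itemset, read from the tree via node links."""
    last = max(itemset, key=order.get)  # the item deepest in every path
    rest = set(itemset) - {last}
    total = 0
    for node in links[last]:
        found = set()
        p = node.parent
        while p is not None and p.item is not None:
            if p.item in rest:
                found.add(p.item)
            p = p.parent
        if found == rest:  # every other item lies on the path to the root
            total += node.count
    return total


def mine_maximal(transactions, min_support):
    """Depth-first search for maximal frequent itemsets (assumed sketch)."""
    links, order = build_prefix_tree(transactions, min_support)
    items = sorted(order, key=order.get)
    maximal = []

    def dfs(prefix, start):
        extended = False
        for idx in range(start, len(items)):
            candidate = prefix + [items[idx]]
            if support(candidate, links, order) >= min_support:
                extended = True
                dfs(candidate, idx + 1)
        # Supersets that contain an earlier-ranked item are visited first,
        # so a subset check against the sets found so far is sufficient.
        if prefix and not extended and not any(set(prefix) <= m for m in maximal):
            maximal.append(frozenset(prefix))

    dfs([], 0)
    return maximal
```

For example, with the transactions `abc`, `abc`, `abd`, `ad`, `bc` and `min_support=2`, the maximal frequent itemsets are `{a, b, c}` and `{a, d}`. Every other frequent itemset, such as `{a, b}` or `{d}`, is a subset of one of these two.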