To improve the efficiency of association rule mining, an algorithm for mining maximal frequent itemsets based on an improved FP-Tree (frequent pattern tree) structure is presented. First, the process of mining maximal frequent itemsets and the existing algorithms are introduced and analyzed, and the most time-consuming step in the existing algorithms is identified. An improved algorithm, IMMFI (improved mining maximal frequent itemsets), is then proposed. It overcomes a drawback of the MMFI algorithm, which must repeatedly start from the header table and follow the same-item node link to search for the nodes on the right. By introducing a leaf-node link into the ordered FP-Tree and searching along this leaf link instead of along the node links, the number of searches is reduced and the efficiency of the algorithm is improved. Finally, a worked example and experimental results demonstrate the good performance of IMMFI.
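The leaf-link idea can be pictured with a minimal Python sketch. It is not the IMMFI algorithm itself, only an illustration under stated assumptions: the names `Node`, `build_fp_tree`, `link_leaves`, and `paths_from_leaf_chain` are hypothetical, items are ordered by descending support with ties broken alphabetically, and leaves are chained in depth-first order. Each root-to-leaf path is then read once by following the leaf chain and the parent pointers, rather than by restarting from the header table.

```python
from collections import Counter


class Node:
    """FP-Tree node with an extra pointer for the (assumed) leaf chain."""

    def __init__(self, item, parent):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}
        self.next_leaf = None  # next leaf in the leaf chain, if any


def build_fp_tree(transactions, min_support):
    """Build an FP-Tree over the items whose support is >= min_support."""
    support = Counter(item for t in transactions for item in set(t))
    frequent = {item for item, c in support.items() if c >= min_support}

    def order(item):
        # Descending support, ties broken alphabetically (an assumption).
        return (-support[item], item)

    root = Node(None, None)
    for t in transactions:
        node = root
        for item in sorted(set(t) & frequent, key=order):
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root


def link_leaves(root):
    """Thread all leaves into a singly linked chain and return its head."""
    head = tail = None
    stack = [root]
    while stack:
        node = stack.pop()
        if not node.children and node is not root:
            if head is None:
                head = tail = node
            else:
                tail.next_leaf = node
                tail = node
        # Push children in reverse item order so the smallest item is visited first.
        stack.extend(sorted(node.children.values(),
                            key=lambda n: n.item, reverse=True))
    return head


def paths_from_leaf_chain(head):
    """Walk the leaf chain; for each leaf, climb parent pointers to the root."""
    paths = []
    leaf = head
    while leaf is not None:
        path = []
        node = leaf
        while node.parent is not None:
            path.append((node.item, node.count))
            node = node.parent
        paths.append(path[::-1])
        leaf = leaf.next_leaf
    return paths
```

Every maximal frequent itemset is contained in some root-to-leaf path, which is why such paths are natural candidates; checking the support of a path and of its subsets is the job of the mining algorithm proper and is not shown here.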