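Existing TOP-K high utility pattern mining algorithms need to generate candidate itemsets, and their time and space efficiency suffers considerably, particularly when the dataset is large or contains many long transaction itemsets. To address this problem, transaction itemsets and itemset utility information are stored in a tree structure, HUP-Tree, and a mining algorithm, TOPKHUP, that requires no candidate itemsets is presented. HUP-Tree guarantees that the utility value of every pattern can be computed from the tree without rescanning the dataset, so the time and space efficiency of the mining algorithm improves substantially. The performance of the algorithm is tested on seven typical datasets. The experimental results show that TOPKHUP is more efficient in both time and space than existing algorithms and remains stable as the value of K changes.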
Mining TOP-K high utility patterns from a dataset is an extension of frequent pattern mining, and it aims to mine the patterns whose utilities are higher than a user-specified minimum utility threshold. At present, it has been a topic in data mining. Existing algorithms for mining TOP-K high utility patterns generate candidate itemsets in the mining process and need multiple scans of a dataset; this hinders their runtime and memory performance, especially when a dataset is large or contains many long transaction itemsets. To address this issue, we propose a tree structure called HUP-Tree (high utility pattern tree) to maintain transaction itemsets and their utility values, and we also give an algorithm named TOPKHUP (TOP-K high utility pattern) that mines TOP-K high utility patterns without generating candidates. HUP-Tree ensures efficient retrieval of the utility value of each pattern without an additional scan of the dataset, so the performance of the algorithm is effectively improved. Seven classical real and synthetic datasets are used in the testing experiments, and the results show that the proposed algorithm significantly outperforms state-of-the-art algorithms in both runtime and memory usage, and that it is more stable as the value of K changes.
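To make the problem statement concrete, the following minimal sketch computes itemset utilities and selects the K highest by exhaustive enumeration. It is not TOPKHUP and does not build a HUP-Tree; it only illustrates what "TOP-K high utility patterns" means. It assumes the usual definition of utility as purchased quantity times unit profit, summed over the transactions that contain the whole itemset, which the abstract does not spell out. The toy transactions, the profit table, and the function names are hypothetical.

```python
from itertools import combinations
import heapq

# Hypothetical toy data: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"b": 1, "c": 3},
]
# Hypothetical unit profit of each item (an assumption, not from the paper).
profit = {"a": 5, "b": 2, "c": 1}


def utility(itemset, transactions, profit):
    """Sum quantity * profit over every transaction containing the whole itemset."""
    total = 0
    for t in transactions:
        if all(i in t for i in itemset):
            total += sum(t[i] * profit[i] for i in itemset)
    return total


def top_k_brute_force(transactions, profit, k):
    """Enumerate every itemset and keep the k with the highest utility."""
    items = sorted({i for t in transactions for i in t})
    scored = []
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = utility(itemset, transactions, profit)
            if u > 0:  # skip itemsets that occur in no transaction
                scored.append((u, itemset))
    return heapq.nlargest(k, scored)


print(top_k_brute_force(transactions, profit, 3))
# [(15, ('a',)), (11, ('a', 'c')), (9, ('a', 'b'))]
```

The enumeration visits every subset of the item universe and rescans all transactions for each one, so its cost grows exponentially with the number of items. That is the kind of overhead that candidate-free, tree-based approaches such as the one described above aim to avoid.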