High utility pattern mining is a fundamental research topic in data mining, and a growing number of algorithms have been proposed for mining top-k high utility patterns, where k is the number of high utility patterns the user wants to mine. These algorithms fall into two classes: two-phase algorithms and single-phase algorithms. The main difference between them is that the former generate a large number of candidate patterns during mining, which is the primary factor limiting their performance, whereas the latter mine top-k high utility patterns without generating candidates. To mine the k patterns with the highest utility more efficiently, this paper proposes a single-phase algorithm, TKHUP. The algorithm relies mainly on four effective strategies to reduce time and space consumption during mining. Extensive experiments show that TKHUP outperforms other top-k high utility pattern mining algorithms in running time.
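To make the problem statement concrete, the following is a minimal brute-force sketch of top-k high utility pattern mining. It is not TKHUP and does not implement any of its four strategies, which the abstract does not describe. It assumes the common definition in which the utility of an itemset in a transaction is the sum of quantity times unit profit over its items; the toy database, the profit table, and the function names `utility` and `top_k_high_utility` are hypothetical and chosen only for illustration. A min-heap of size k holds the best patterns found so far, so its root acts as a minimum utility threshold that rises as mining proceeds.

```python
import heapq
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 1, "d": 2},
    {"a": 1, "b": 1, "d": 1},
]
# Hypothetical unit profit of each item.
profit = {"a": 5, "b": 2, "c": 1, "d": 4}


def utility(itemset, db, unit_profit):
    """Sum quantity * unit profit over every transaction containing the whole itemset."""
    total = 0
    for t in db:
        if all(item in t for item in itemset):
            total += sum(t[item] * unit_profit[item] for item in itemset)
    return total


def top_k_high_utility(db, unit_profit, k):
    """Enumerate every itemset (exponential, illustration only) and keep the k best."""
    items = sorted({item for t in db for item in t})
    heap = []  # min-heap of (utility, itemset); heap[0] is the current threshold
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = utility(itemset, db, unit_profit)
            if u == 0:
                continue  # itemset never occurs in the database
            if len(heap) < k:
                heapq.heappush(heap, (u, itemset))
            elif u > heap[0][0]:
                # Better than the weakest kept pattern: replace it, raising the threshold.
                heapq.heapreplace(heap, (u, itemset))
    # Report patterns from highest to lowest utility.
    return sorted(heap, key=lambda pair: (-pair[0], pair[1]))


if __name__ == "__main__":
    for u, itemset in top_k_high_utility(transactions, profit, k=2):
        print(itemset, u)
```

Unlike this exhaustive enumeration, practical algorithms prune the search space; a two-phase algorithm would first collect candidate patterns and verify their utilities in a second pass, while a single-phase algorithm computes exact utilities during the search itself.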