Because of combinatorial explosion, the number of frequent subtrees usually grows exponentially with tree size. As a result, there are too many frequent subtrees for users to manage and use. To address this problem, we generalize a compression framework based on δ-clusters to the problem of compressing sets of frequent subtrees, and propose an algorithm, RPTlocal, which mines a compressed set of frequent subtrees directly. The algorithm sacrifices the theoretical bounds but still achieves good compression quality. By pruning the search space and generating frequent subtrees directly, it is also efficient. Experimental results show that the number of representative subtrees mined by RPTlocal is almost two orders of magnitude smaller than the whole collection of closed subtrees, and that RPTlocal is more efficient than CMtreeMiner, an algorithm for mining both closed and maximal frequent subtrees.
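To make the idea of δ-cluster compression concrete, the following is a minimal sketch and not RPTlocal itself. It rests on assumptions the abstract does not state: a representative is taken to δ-cover a pattern when it contains the pattern and the Jaccard distance between their supporting-tree sets is at most δ (the usual definition in δ-cluster compression of itemsets), each subtree is encoded simply as a set of edge labels so that containment becomes a subset test, and representatives are picked by a plain greedy cover over patterns that have already been mined. All names (`distance`, `delta_covers`, `greedy_representatives`, `EXAMPLE`) are hypothetical.

```python
from typing import Dict, FrozenSet, List

Pattern = FrozenSet[str]   # assumption: a subtree encoded as a set of edge labels
Support = FrozenSet[int]   # IDs of the database trees that contain the pattern


def distance(s1: Support, s2: Support) -> float:
    """Jaccard distance between two supporting-tree sets (assumed measure)."""
    union = s1 | s2
    return 1.0 - len(s1 & s2) / len(union) if union else 0.0


def delta_covers(rep: Pattern, pat: Pattern,
                 supports: Dict[Pattern, Support], delta: float) -> bool:
    """True if `rep` contains `pat` and their supports differ by at most delta."""
    return pat <= rep and distance(supports[pat], supports[rep]) <= delta


def greedy_representatives(supports: Dict[Pattern, Support],
                           delta: float) -> List[Pattern]:
    """Greedily pick patterns until every pattern is delta-covered by one of them."""
    uncovered = set(supports)
    representatives: List[Pattern] = []
    while uncovered:
        def gain(rep: Pattern):
            # Primary key: how many uncovered patterns `rep` would cover.
            # Remaining keys only make tie-breaking deterministic.
            covered = sum(delta_covers(rep, p, supports, delta) for p in uncovered)
            return (covered, len(rep), sorted(rep))

        best = max(supports, key=gain)
        representatives.append(best)
        uncovered -= {p for p in uncovered
                      if delta_covers(best, p, supports, delta)}
    return representatives


# Toy data: three nested patterns with similar support, plus one unrelated pattern.
EXAMPLE: Dict[Pattern, Support] = {
    frozenset({"a-b"}): frozenset({1, 2, 3, 4}),
    frozenset({"a-b", "b-c"}): frozenset({1, 2, 3}),
    frozenset({"a-b", "b-c", "c-d"}): frozenset({1, 2, 3}),
    frozenset({"x-y"}): frozenset({4, 5}),
}

if __name__ == "__main__":
    for delta in (0.3, 0.1):
        reps = greedy_representatives(EXAMPLE, delta)
        print(delta, [sorted(r) for r in reps])
```

With the toy data, a looser δ lets the largest nested pattern stand in for its two sub-patterns, while a tighter δ forces more representatives. This sketch post-processes an already-mined pattern set; the abstract instead describes RPTlocal as producing the compressed set directly while pruning the search space.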