Based on an analysis of the characteristics of effective access sequences in practical applications, this paper proposes OUS, an algorithm that uses a bottom-up strategy to mine maximal frequent itemsets quickly. The algorithm first applies an overlap operation to the user itemsets to count how many times each was browsed, then merges them and removes infrequent page elements from the original itemsets according to the user-specified minimum support. It next filters the user itemsets pairwise to generate candidate frequent itemsets, and finally scans the database to count the support of each candidate frequent itemset. Experimental results show that the OUS algorithm can effectively discover users' maximal frequent itemsets.
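The abstract gives only an outline of the steps, so the following Python sketch is one possible reading of that outline and not the authors' implementation. It assumes that the "overlap" step merges identical itemsets and counts them, that the pairwise filtering step takes intersections of the pruned itemsets, and that minimum support is an absolute count. The function name `ous_sketch` and its parameters are hypothetical. A single round of pairwise intersections is a simplification and is not guaranteed to find every maximal frequent itemset.

```python
from collections import Counter
from itertools import combinations


def ous_sketch(sessions, min_support):
    """Illustrative sketch only; names and details are assumptions."""
    # Step 1 (assumed meaning of "overlap"): merge identical user itemsets
    # and record how many times each one occurs.
    overlapped = Counter(frozenset(s) for s in sessions)

    # Step 2: count each page, then drop pages below the minimum support
    # from every itemset, merging itemsets that become identical.
    page_count = Counter()
    for itemset, n in overlapped.items():
        for page in itemset:
            page_count[page] += n
    frequent_pages = {p for p, c in page_count.items() if c >= min_support}

    pruned = Counter()
    for itemset, n in overlapped.items():
        kept = itemset & frequent_pages
        if kept:
            pruned[frozenset(kept)] += n

    # Step 3 (assumed filtering rule): candidates are the pruned itemsets
    # plus the non-empty intersection of every pair of them.
    candidates = set(pruned)
    for a, b in combinations(pruned, 2):
        common = a & b
        if common:
            candidates.add(common)

    # Step 4: scan the original database to count each candidate's support.
    database = [frozenset(s) for s in sessions]
    support = {c: sum(1 for t in database if c <= t) for c in candidates}
    frequent = {c: n for c, n in support.items() if n >= min_support}

    # Keep only maximal itemsets: those with no frequent proper superset.
    return {
        c: n
        for c, n in frequent.items()
        if not any(c < other for other in frequent)
    }


if __name__ == "__main__":
    # Hypothetical browsing sessions; each inner list is one user's pages.
    demo = [["a", "b", "c"], ["a", "b", "c"], ["a", "b", "d"], ["b", "e"]]
    print(ous_sketch(demo, min_support=2))
```

With the hypothetical sessions above and a minimum support of 2, pages `d` and `e` are pruned in step 2, and the only maximal frequent itemset that survives is `{a, b, c}` with support 2.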