This paper proposes Apriori-BR, an improved Apriori algorithm for association rule mining based on bit operations and an inverted (reverse) index. The algorithm first scans the database twice to build an inverted index from each frequent itemset to the transactions that contain it, and groups the index entries by transaction length. During mining, it uses bit operations to speed up subset detection and, where necessary, dynamically removes invalid low-dimensional transactions. Experimental results show that Apriori-BR significantly improves mining efficiency compared with the classical Apriori algorithm and with improved variants reported in the literature.
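The abstract does not give implementation details, so the following is only a minimal Python sketch of the three ideas it names, not the authors' Apriori-BR. It rests on several assumptions: transactions are encoded as integer bitmasks over the frequent items, the inverted index maps each single frequent item to its transactions (grouped by transaction length after infrequent items are dropped), and "low-dimensional" transactions are taken to mean those shorter than the current itemset size k. The names `apriori_bitmask`, `single_bits`, and `popcount` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def popcount(mask):
    """Number of set bits, i.e. number of items encoded in the mask."""
    return bin(mask).count("1")


def single_bits(mask):
    """Yield each set bit of mask as its own one-bit mask."""
    while mask:
        low = mask & -mask  # isolate the lowest set bit
        yield low
        mask ^= low


def apriori_bitmask(transactions, min_support):
    """Return {frozenset(itemset): support count} for all frequent itemsets."""
    # Scan 1: count single items and keep the frequent ones.
    counts = defaultdict(int)
    for t in transactions:
        for item in set(t):
            counts[item] += 1
    items = sorted(i for i, c in counts.items() if c >= min_support)
    bit = {item: 1 << pos for pos, item in enumerate(items)}

    # Scan 2: build the inverted index item bit -> {length: [transaction masks]}.
    # Assumption: length is measured after infrequent items are dropped.
    index = defaultdict(lambda: defaultdict(list))
    for t in transactions:
        mask = 0
        for item in set(t):
            mask |= bit.get(item, 0)
        for b in single_bits(mask):
            index[b][popcount(mask)].append(mask)

    def decode(mask):
        return frozenset(i for i in items if bit[i] & mask)

    result = {frozenset([i]): counts[i] for i in items}
    current = [bit[i] for i in items]
    k = 2
    while current:
        # Dynamic pruning: transactions shorter than k cannot contain a k-itemset.
        for postings in index.values():
            postings.pop(k - 1, None)
        # Join step: unions of two frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a, b in combinations(current, 2)
                      if popcount(a | b) == k}
        frequent_prev = set(current)
        current = []
        for c in candidates:
            # Apriori pruning: every (k-1)-subset must itself be frequent.
            if any((c ^ b) not in frequent_prev for b in single_bits(c)):
                continue
            # Look up one item's postings, then test containment with a bitwise AND.
            postings = index[c & -c]
            support = sum(1 for masks in postings.values()
                          for m in masks if m & c == c)
            if support >= min_support:
                current.append(c)
                result[decode(c)] = support
        k += 1
    return result
```

In this sketch, the subset test `m & c == c` replaces an element-by-element comparison, and only the postings of one item of the candidate are scanned rather than the whole database. Any speed-up in practice depends on the data and on details the abstract does not specify.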