An Efficient Association Rule Mining Algorithm Based on a Classification Tree

Journal: Journal of Jiangsu University (Natural Science Edition), 2006, 27(1): 51-54
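Classification: TP317.4 [Automation and Computer Technology: Computer Software and Theory; Automation and Computer Technology: Computer Science and Technology]
Author affiliations: [1] School of Computer Science and Telecommunications Engineering, Jiangsu University, Zhenjiang, Jiangsu 212013; [2] School of Computer Technology and Automation, Tianjin Polytechnic University, Tianjin 300160
Funding: National Natural Science Foundation of China (60572112); Zhenjiang Social Development Fund (SH2003014)
Related project: Research on medical image classification based on data mining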

Keywords: data mining, association rule, classification tree, frequent itemset

Abstract (translated from the Chinese):

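Building on an analysis of the efficiency bottleneck in Apriori-like algorithms, an efficient improved algorithm is proposed: an association rule mining algorithm based on a classification tree. The algorithm accesses the database only twice and stores the database's data in a classification tree, which reduces the number of database accesses. It then derives the frequent itemsets from all or part of the classification tree, which reduces the number of comparisons needed to find them. By combining the Apriori and FP-tree algorithms, it improves mining efficiency and lowers both the time complexity and the space complexity of mining. Repeated experiments show that its mining efficiency is 2 to 8 times higher than that of Apriori and its improved variants.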

English abstract:

Based on an analysis of the performance bottleneck of Apriori-like algorithms, an efficient algorithm for mining frequent itemsets faster is proposed, named Classification Tree Based Association Rule (CTBAR). CTBAR scans the database only twice. It stores the data from the database in a classification tree and uses all or part of that tree to compute the frequent itemsets, which reduces both the number of database accesses and the number of comparisons made while computing them. CTBAR improves mining efficiency by combining the Apriori and FP-tree methods, reducing time and space complexity while ensuring the correctness of the mined results. Several experiments compare the algorithm's performance with that of Apriori and its extended algorithm. The experimental evaluation shows that CTBAR is faster than the other two algorithms by a factor of two to eight.
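The abstract does not give CTBAR's internals, so the following is only a minimal Python sketch of the general idea it describes: scan the database twice, store the transactions in a tree, then count candidate itemsets against the tree instead of rescanning the database. The tree here is an ordinary FP-tree-style prefix tree, and the candidate generation is the standard Apriori join and prune. All names (Node, build_tree, support, frequent_itemsets) are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


class Node:
    """One prefix-tree node: an item, a pass-through count, and children keyed by item."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


def build_tree(transactions, min_support):
    # Scan 1: count the support of every single item.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i for i, c in counts.items() if c >= min_support}
    # Fixed global order: descending support, ties broken by item name.
    ranked = sorted(frequent, key=lambda i: (-counts[i], i))
    order = {item: rank for rank, item in enumerate(ranked)}
    # Scan 2: insert the frequent items of each transaction as one sorted path.
    root = Node()
    for t in transactions:
        node = root
        for item in sorted(set(t) & frequent, key=order.__getitem__):
            node = node.children.setdefault(item, Node(item))
            node.count += 1
    return root, order


def support(node, itemset, order):
    """Count transactions containing `itemset` (a tuple sorted by `order`) from the tree alone."""
    if not itemset:
        return node.count  # every transaction through this node matched all items
    target, total = itemset[0], 0
    for child in node.children.values():
        if child.item == target:
            total += support(child, itemset[1:], order)
        elif order[child.item] < order[target]:
            # Paths are sorted, so the target can still occur deeper on this branch.
            total += support(child, itemset, order)
    return total


def frequent_itemsets(transactions, min_support):
    root, order = build_tree(transactions, min_support)  # the only two database scans
    items = sorted(order, key=order.__getitem__)
    result = {}
    level = [(i,) for i in items]
    while level:
        survivors = []
        for cand in level:
            s = support(root, cand, order)  # counted against the tree, not the database
            if s >= min_support:
                result[cand] = s
                survivors.append(cand)
        alive = set(survivors)
        level = []
        for cand in survivors:
            # Apriori join: extend with items that come later in the global order.
            for item in items[order[cand[-1]] + 1:]:
                new = cand + (item,)
                # Apriori prune: every subset one item shorter must be frequent.
                if all(sub in alive for sub in combinations(new, len(cand))):
                    level.append(new)
    return result
```

For example, frequent_itemsets([["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c", "d"]], 3) returns the three single items with support 4 and the three pairs with support 3. The triple ("a", "b", "c") is generated as a candidate but rejected because only two transactions contain it. How closely this matches the paper's classification tree is not known from the abstract.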
