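Mining global negative association rules is an important topic in multi-database association mining, with a wide range of applications and practical value. Existing methods commonly merge the negative association rules of the individual sub-databases, but high data density, incomplete rule sets, and long running times limit their efficiency. This paper presents a new algorithm for mining global negative association rules, with the following steps: (1) scan each sub-database and build a multi-database frequent pattern tree; (2) prune the multi-database frequent pattern tree according to the principle of global consistency of frequent itemsets; (3) on this basis, generate the global minimal infrequent itemsets; (4) generate the global infrequent itemsets according to the upward-closure principle of maximal frequent itemsets; (5) extract the global negative association rules on the basis of rule correlation. Extensive comparative experiments show that the proposed algorithm can discover global negative association rules quickly.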
Recently, mining global negative association rules in multi-databases has become an important research area. Most existing work merges the negative rules mined from each individual database into a single rule set. However, these methods suffer from problems such as high data density, incomplete rules, and high time consumption. In this paper, a novel method, GNAR (Global Negative Association Rules in multi-databases), is presented to tackle these problems. Firstly, a Multi-Database Frequent Pattern tree (MDFP-tree) is constructed by scanning the multi-databases. Secondly, the MDFP-tree is pruned according to the principle of global consistency of frequent itemsets. Thirdly, the global small infrequent itemsets (SIFS) are produced, and the global infrequent itemsets are generated from them using the upward closure of frequent itemsets. Finally, the global negative association rules are extracted based on a correlation metric. Experimental results show that the proposed method can mine negative association rules in multi-databases quickly.
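To make the pipeline concrete, the following is a minimal Python sketch of the last two stages: finding the small (minimal) infrequent itemsets over the combined databases and extracting negative rules of the form A ⇒ ¬B from them. It rests on several assumptions that are not taken from the paper: brute-force candidate enumeration stands in for the MDFP-tree, global support is taken as the fraction of all transactions across all databases, the correlation metric is assumed to be lift, supp(A ∪ B) / (supp(A) · supp(B)), and the function names, thresholds, and toy data are purely illustrative.

```python
from itertools import combinations


def global_support(itemset, databases):
    """Fraction of all transactions (over every database) that contain itemset."""
    total = sum(len(db) for db in databases)
    hits = sum(1 for db in databases for t in db if itemset <= t)
    return hits / total


def minimal_infrequent_itemsets(databases, min_sup, max_size=3):
    """Return (frequent itemsets, minimal infrequent itemsets) up to max_size.

    An itemset is minimal infrequent when it is infrequent but every proper
    subset is frequent. Brute force here; the paper uses an MDFP-tree instead.
    """
    items = sorted({i for db in databases for t in db for i in t})
    frequent, minimal_infrequent = set(), []
    for k in range(1, max_size + 1):
        for combo in combinations(items, k):
            # Skip candidates that already have an infrequent (k-1)-subset.
            if k > 1 and any(frozenset(sub) not in frequent
                             for sub in combinations(combo, k - 1)):
                continue
            candidate = frozenset(combo)
            if global_support(candidate, databases) >= min_sup:
                frequent.add(candidate)
            else:
                minimal_infrequent.append(candidate)
    return frequent, minimal_infrequent


def negative_rules(itemset, databases, min_conf):
    """Extract rules A => not B from one infrequent itemset.

    Assumed metric: lift < 1 signals negative correlation, and
    conf(A => not B) = (supp(A) - supp(A u B)) / supp(A).
    """
    rules = []
    items = sorted(itemset)
    s_ab = global_support(itemset, databases)
    for r in range(1, len(items)):
        for ante in combinations(items, r):
            a = frozenset(ante)
            b = itemset - a
            s_a = global_support(a, databases)
            s_b = global_support(b, databases)
            if s_a == 0 or s_b == 0:
                continue  # lift is undefined
            lift = s_ab / (s_a * s_b)
            conf = (s_a - s_ab) / s_a
            if lift < 1 and conf >= min_conf:
                rules.append((a, b, lift, conf))
    return rules


# Toy data: two sub-databases of four transactions each (hypothetical).
db1 = [{"tea", "milk"}, {"tea", "milk"}, {"coffee", "milk"}, {"coffee"}]
db2 = [{"tea"}, {"coffee", "milk"}, {"coffee"}, {"tea", "milk"}]
databases = [db1, db2]

frequent, mis = minimal_infrequent_itemsets(databases, min_sup=0.25)
rules = [rule for s in mis if len(s) > 1
         for rule in negative_rules(s, databases, min_conf=0.8)]

for a, b, lift, conf in rules:
    print(f"{sorted(a)} => not {sorted(b)}  lift={lift:.2f}  conf={conf:.2f}")
```

On this toy data the only minimal infrequent itemset is {coffee, tea}, which never occurs in any transaction, so the sketch reports the two negative rules between tea and coffee. A tree-based method avoids the repeated full scans that `global_support` performs here; the sketch trades that efficiency for brevity.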