Apriori is one of the classic data mining algorithms. It filters the generated frequent itemsets by support and confidence to find strong rules. The traditional Apriori algorithm generates a large number of candidate itemsets and scans the database many times, so its storage and communication overhead is very high. A cloud computing environment can address the storage problem, so this paper proposes a new association rule algorithm designed for the MapReduce programming framework. The algorithm addresses the time and space shortcomings of traditional Apriori and improves mining efficiency.
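
To make the terms support and confidence concrete, the following is a minimal single-process sketch of one support-counting pass written in a map/reduce style. It is not the algorithm proposed in this paper, whose details the abstract does not give. It also skips Apriori's candidate pruning and simply enumerates every k-item subset of each transaction. The function names (`map_phase`, `reduce_phase`, `frequent_itemsets`, `confidence`) and the sample data are hypothetical, chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def map_phase(transaction, k):
    """Map step: emit (itemset, 1) for every k-item subset of one transaction."""
    for itemset in combinations(sorted(set(transaction)), k):
        yield itemset, 1


def reduce_phase(pairs):
    """Reduce step: sum the emitted counts per itemset."""
    counts = Counter()
    for itemset, n in pairs:
        counts[itemset] += n
    return counts


def frequent_itemsets(transactions, k, min_support):
    """Return {itemset: support} for k-itemsets whose support >= min_support.

    Support is the fraction of transactions that contain the itemset.
    """
    pairs = (pair for t in transactions for pair in map_phase(t, k))
    counts = reduce_phase(pairs)
    total = len(transactions)
    return {s: c / total for s, c in counts.items() if c / total >= min_support}


def confidence(transactions, antecedent, consequent):
    """Confidence of antecedent -> consequent: support(A and C) / support(A)."""
    a = set(antecedent)
    both = a | set(consequent)
    n_a = sum(1 for t in transactions if a <= set(t))
    n_both = sum(1 for t in transactions if both <= set(t))
    return n_both / n_a if n_a else 0.0


# Hypothetical sample data, not taken from the paper.
transactions = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
]
frequent_pairs = frequent_itemsets(transactions, k=2, min_support=0.5)
rule_confidence = confidence(transactions, ["bread"], ["milk"])
```

In a real MapReduce job the map step would run in parallel over partitions of the transaction database and the framework would group the emitted keys before the reduce step; here both steps run in one process to keep the sketch self-contained. A rule is usually called strong when both its support and its confidence meet user-chosen thresholds.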