This paper proposes SCMAR, a data stream classification algorithm based on multiple class-association rules. SCMAR modifies the FP-tree construction process of the CMAR algorithm to improve the tree's time and space efficiency. It uses the Hoeffding bound to mine and maintain all frequent rules in the data stream, updating them dynamically as new data arrive. The mined rules are stored in a CR-tree together with statistical information for each rule, so that at classification time each rule can be evaluated and the appropriate rules selected. Theoretical analysis and experimental results show that the algorithm is effective and feasible.
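As a rough illustration of how a Hoeffding bound can support frequency decisions on a stream, the sketch below computes the standard bound ε = sqrt(R² ln(1/δ) / (2n)) and uses it to keep an itemset as a candidate while its observed support is within ε of the threshold. The function names, the `min_support - ε` retention test, and the choice R = 1 for support values are assumptions made for this example; the abstract does not specify how SCMAR applies the bound.

```python
import math


def hoeffding_epsilon(n, delta, value_range=1.0):
    """Hoeffding bound: with probability at least 1 - delta, the true mean
    lies within epsilon of the mean observed over n independent samples
    of a variable whose range is value_range."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must be in (0, 1)")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def may_be_frequent(count, n, min_support, delta):
    """Hypothetical retention test (an assumption, not taken from the paper):
    keep an itemset while its observed support is no more than epsilon
    below the minimum support threshold."""
    epsilon = hoeffding_epsilon(n, delta)  # support lies in [0, 1], so R = 1
    return count / n >= min_support - epsilon
```

Under these assumptions, ε shrinks as more transactions are seen, so borderline itemsets are kept early in the stream and pruned later once the estimate of their support becomes reliable.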