To cope effectively with concept drift in data streams, a data stream classification algorithm combined with unsupervised learning is proposed. The algorithm builds on ensemble classification, introduces attribute reduction into the classification process, and applies a clustering algorithm to the data. Concept drift is detected by comparing the accuracy of the classification results with that of the clustering results. Experimental results show that the proposed algorithm performs well in terms of combined time cost and accuracy.
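
The drift check based on comparing classification and clustering accuracy can be illustrated with a minimal sketch. The abstract does not state the actual decision rule, so everything here is an assumption made for illustration: clustering accuracy is taken to be purity (each cluster votes for its majority label), drift is flagged when clustering accuracy exceeds the ensemble's accuracy by more than a margin, and the function names and the `margin` parameter are hypothetical.

```python
from collections import Counter, defaultdict


def classification_accuracy(predicted, labels):
    """Fraction of instances whose predicted label matches the true label."""
    return sum(p == y for p, y in zip(predicted, labels)) / len(labels)


def clustering_accuracy(cluster_ids, labels):
    """Purity (assumed measure): each cluster votes for its majority label."""
    members = defaultdict(list)
    for c, y in zip(cluster_ids, labels):
        members[c].append(y)
    # Count, per cluster, the instances that carry the cluster's majority label.
    matched = sum(Counter(ys).most_common(1)[0][1] for ys in members.values())
    return matched / len(labels)


def drift_suspected(predicted, cluster_ids, labels, margin=0.1):
    """Assumed rule: flag drift when clustering beats the ensemble by more than margin."""
    gap = clustering_accuracy(cluster_ids, labels) - classification_accuracy(predicted, labels)
    return gap > margin
```

Under this assumed rule, a data chunk that the clustering still separates cleanly but that the ensemble misclassifies suggests the learned concept no longer matches the data. The margin trades off sensitivity against false alarms and would need tuning on the stream at hand.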