To address the high communication overhead and low classification accuracy of traditional distributed data stream mining algorithms, this paper proposes a distributed data stream mining algorithm based on Support Vector Data Description (SVDD). Each local site rapidly updates its data stream information, learns meta-level data with a Support Vector Machine (SVM) algorithm, and transmits the meta-level data to the central site. The central site receives and merges the meta-level data and learns a global classification model. Experimental results show that the algorithm reduces the network traffic between the local sites and the central site while achieving high global classification accuracy.
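The local-site and central-site roles described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a linear-kernel, hard-margin data description, which reduces to a minimum enclosing ball, and approximates that ball with the Bădoiu-Clarkson iteration. The function names, the iteration count, and the synthetic two-class data are all hypothetical and chosen only for illustration.

```python
import math
import random


def approx_enclosing_ball(points, iterations=50):
    """Badoiu-Clarkson approximation of the minimum enclosing ball.

    Returns (center, radius, core_points), where core_points are the
    farthest points visited; they play the role of meta-level data here.
    """
    center = list(points[0])
    core = {}
    for i in range(1, iterations + 1):
        far = max(points, key=lambda p: math.dist(p, center))
        core[tuple(far)] = True
        # Move the center a shrinking step toward the farthest point.
        center = [c + (f - c) / (i + 1) for c, f in zip(center, far)]
    radius = max(math.dist(p, center) for p in points)
    return center, radius, [list(p) for p in core]


def local_summary(chunk, iterations=50):
    """Local site: reduce a chunk of (features, label) pairs to core points per class."""
    by_label = {}
    for x, y in chunk:
        by_label.setdefault(y, []).append(x)
    return {y: approx_enclosing_ball(xs, iterations)[2] for y, xs in by_label.items()}


def merge_summaries(summaries, iterations=50):
    """Central site: pool the core points of all sites and refit one ball per class."""
    merged = {}
    for summary in summaries:
        for y, pts in summary.items():
            merged.setdefault(y, []).extend(pts)
    return {y: approx_enclosing_ball(pts, iterations)[:2] for y, pts in merged.items()}


def classify(model, x):
    """Assign x to the class whose ball center is nearest relative to its radius."""
    return min(model, key=lambda y: math.dist(x, model[y][0]) / max(model[y][1], 1e-12))


def make_site(rng, n=200):
    """Hypothetical synthetic chunk: two well-separated Gaussian classes."""
    data = []
    for _ in range(n):
        data.append(([rng.gauss(0, 1), rng.gauss(0, 1)], "a"))
        data.append(([rng.gauss(8, 1), rng.gauss(8, 1)], "b"))
    return data


rng = random.Random(0)
sites = [make_site(rng) for _ in range(3)]
summaries = [local_summary(s) for s in sites]   # computed at each local site
model = merge_summaries(summaries)              # computed at the central site

sent = sum(len(pts) for s in summaries for pts in s.values())
total = sum(len(s) for s in sites)
print(f"points transmitted: {sent} of {total}")
print(classify(model, [0.0, 0.0]), classify(model, [8.0, 8.0]))
```

In this sketch only the core points leave each local site, so the transmitted volume is bounded by the iteration count per class rather than by the chunk size. A kernelized SVDD or SVM learner would replace `approx_enclosing_ball` in a closer reproduction of the approach.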