To address the high computational complexity and slow training speed of the support vector machine (SVM) algorithm when it is applied to large-scale network traffic classification, a parallel SVM method for network traffic classification is proposed. The method is built on a cloud computing platform and aims to speed up training on large datasets. It uses the cloud computing platform to construct a multistage SVM combined with the MapReduce model. The training dataset is split into multiple sub-datasets, all sub-datasets are trained in parallel to obtain the set of support vectors, and the traffic classification model is then trained from that set. Experimental results show that, compared with the traditional SVM method, the parallel SVM network traffic classification method maintains high classification accuracy while effectively reducing training time and improving the speed of large-scale network traffic classification.
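The split, parallel-train, and merge flow described above can be illustrated with a minimal single-machine sketch. It is not the implementation evaluated in this work. It rests on several assumptions: a local thread pool stands in for the MapReduce cluster, a small linear SVM trained by stochastic subgradient descent stands in for the actual SVM solver, only one merge stage is shown rather than a multistage arrangement, and "support vectors" are approximated as the samples lying on or inside the margin of each sub-model. All function names, parameters, and the synthetic two-class data are hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def decision(model, x):
    """Signed score w.x + b of a linear model."""
    w, b = model
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_svm(samples, epochs=200, eta=0.01, lam=0.01, seed=0):
    """Linear SVM via stochastic subgradient descent on the L2-regularized hinge loss.
    samples: list of (features, label) pairs with label in {-1, +1}."""
    rng = random.Random(seed)
    w, b = [0.0] * len(samples[0][0]), 0.0
    order = list(samples)
    for _ in range(epochs):
        rng.shuffle(order)
        for x, y in order:
            violated = y * decision((w, b), x) < 1
            w = [wi * (1 - eta * lam) for wi in w]  # shrinkage from the L2 term
            if violated:  # hinge loss is active for this sample
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
                b += eta * y
    return w, b


def map_partition(samples, tol=0.1):
    """Map step: train a sub-SVM and keep the samples on or inside its margin
    (an assumed approximation of that sub-model's support vectors)."""
    model = train_linear_svm(samples)
    kept = [(x, y) for x, y in samples if y * decision(model, x) <= 1 + tol]
    # Fall back to the whole partition if the kept set lost a class (or is empty).
    if {y for _, y in kept} != {y for _, y in samples}:
        return list(samples)
    return kept


def parallel_svm(data, n_parts=4):
    """Split the data, train the sub-SVMs concurrently, merge their support
    vectors (reduce step), and train the final model on the merged set."""
    parts = [data[i::n_parts] for i in range(n_parts)]
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        sv_sets = list(pool.map(map_partition, parts))
    support_vectors = [s for sv in sv_sets for s in sv]
    return train_linear_svm(support_vectors), support_vectors


def make_toy_flows(n_per_class=100, seed=1):
    """Synthetic, well-separated two-class data standing in for flow features."""
    rng = random.Random(seed)
    data = []
    for label, center in ((1, 2.0), (-1, -2.0)):
        for _ in range(n_per_class):
            data.append(([rng.gauss(center, 0.5), rng.gauss(center, 0.5)], label))
    rng.shuffle(data)
    return data


def accuracy(model, data):
    """Fraction of samples whose predicted sign matches the label."""
    hits = sum(1 for x, y in data if (1 if decision(model, x) >= 0 else -1) == y)
    return hits / len(data)
```

Calling `parallel_svm(make_toy_flows())` returns the final model together with the merged support-vector set, and `accuracy(model, data)` checks the model against the full dataset. Python threads do not give a real speed-up for this CPU-bound loop; the thread pool only shows where the map tasks would be distributed on an actual MapReduce platform.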