Botnets pose an increasingly serious threat to the Internet because they launch targeted attacks against organizations and individuals. At the same time, P2P botnets are becoming harder to detect because they use varied encryption techniques and covert communication channels. Many classification-based detection algorithms in the literature report high overall accuracy but fail to reach comparably high accuracy on each individual class. Moreover, previous work has not considered the severe imbalance between normal network traffic and botnet traffic. To address these two problems, this paper proposes an SVM classification algorithm for imbalanced data that combines edited nearest neighbor (ENN) under-sampling with Adaptive Synthetic Sampling (ADASYN), and applies it to P2P botnet detection. Experimental results show that the method achieves high accuracy on both botnet and legitimate traffic and reaches good classification performance within a short time. Compared with other algorithms, it is better suited to processing large amounts of raw data in large-scale, real-time network environments, it depends little on the statistics of the data (whether the dataset is balanced or not), and it is robust when classifying imbalanced data. The ENN-ADASYN-SVM classification algorithm for imbalanced data is therefore better suited to P2P botnet detection in complex and changing network environments.
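The two resampling steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy two-dimensional "flow features", the choice of k = 3, and the rounding of the per-sample synthesis counts are all assumptions made for illustration. ENN is taken here to remove majority-class (normal) samples whose nearest neighbors mostly belong to the other class, and ADASYN to synthesize more minority-class (botnet) samples around the minority points that are surrounded by majority neighbors. The SVM training step is omitted; the resampled data would then be passed to an SVM implementation of choice.

```python
import math
import random
from collections import Counter


def k_nearest(idx, X, k):
    """Indices of the k samples closest (Euclidean) to X[idx], excluding idx."""
    dists = sorted((math.dist(X[idx], X[j]), j) for j in range(len(X)) if j != idx)
    return [j for _, j in dists[:k]]


def enn_undersample(X, y, majority, k=3):
    """ENN sketch: drop majority samples whose k-NN majority vote disagrees with them."""
    keep = []
    for i in range(len(X)):
        if y[i] == majority:
            votes = Counter(y[j] for j in k_nearest(i, X, k))
            if votes.most_common(1)[0][0] != majority:
                continue  # noisy or borderline majority sample: remove it
        keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


def adasyn_oversample(X, y, minority, k=3, beta=1.0, seed=0):
    """ADASYN sketch: add synthetic minority samples, weighted by local difficulty."""
    rng = random.Random(seed)
    min_idx = [i for i in range(len(X)) if y[i] == minority]
    # Number of synthetic samples needed to reach the balance level beta.
    n_needed = int(beta * ((len(X) - len(min_idx)) - len(min_idx)))
    if n_needed <= 0 or len(min_idx) < 2:
        return list(X), list(y)
    # r_i: fraction of non-minority points among the k neighbors of minority sample i.
    ratios = [sum(y[j] != minority for j in k_nearest(i, X, k)) / k for i in min_idx]
    total = sum(ratios)
    if total == 0:
        weights = [1 / len(min_idx)] * len(min_idx)  # no hard samples: spread evenly
    else:
        weights = [r / total for r in ratios]
    X_min = [X[i] for i in min_idx]
    X_new, y_new = list(X), list(y)
    for pos, w in enumerate(weights):
        g = round(w * n_needed)  # assumption: simple rounding, may miss n_needed by a few
        neighbors = k_nearest(pos, X_min, min(k, len(X_min) - 1))
        for _ in range(g):
            other = X_min[rng.choice(neighbors)]
            lam = rng.random()
            # Interpolate between the minority sample and one of its minority neighbors.
            X_new.append([a + lam * (b - a) for a, b in zip(X_min[pos], other)])
            y_new.append(minority)
    return X_new, y_new


# Toy data (hypothetical): 20 normal flows on a grid, one mislabeled-looking normal
# flow inside the bot cluster, 4 clustered bot flows, and 1 bot flow near the normals.
normal = [[0.5 * a, 0.5 * b] for a in range(5) for b in range(4)]
bots = [[5.0, 5.0], [5.5, 5.0], [5.0, 5.5], [5.5, 5.5], [2.6, 1.6]]
X = normal + [[5.2, 5.2]] + bots
y = ["normal"] * 21 + ["bot"] * 5

X_enn, y_enn = enn_undersample(X, y, majority="normal")
X_res, y_res = adasyn_oversample(X_enn, y_enn, minority="bot")
print(Counter(y), Counter(y_enn), Counter(y_res))
```

In this toy run, ENN removes only the normal sample embedded in the bot cluster, and ADASYN places all synthetic bot samples around the single bot sample that sits next to normal traffic, since that is the only minority point with majority-class neighbors. Evaluating the downstream classifier per class (for example, recall on bot and on normal traffic separately) rather than by overall accuracy alone matches the concern raised in the abstract.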