Distributed denial-of-service (DDoS) attacks are among the most common network attacks today. DDoS detection techniques based on machine learning algorithms such as SVM and HMM have made some progress, but they still have shortcomings: they are prone to over-fitting when the number of samples is very large, and they do not make full use of contextual information. To address these shortcomings, this paper proposes a DDoS attack detection method based on random forest, with the information entropy of the data flow as the classification criterion. Let sourceIP, destinationIP, and destinationPort denote the source address, destination address, and destination port of a data flow. Three information entropies, SIDI (sourceIP-destinationIP), SIDP (sourceIP-destinationPort), and DPDI (destinationPort-destinationIP), characterize three many-to-one features, and are used to analyze three common attack types: TCP flood, UDP flood, and ICMP flood attacks. On this basis, a random forest classification (RFC) model is used to classify and detect each of the three DDoS attack types. Experimental results show that the model distinguishes normal traffic from attack traffic fairly accurately. Compared with the HMM and SVM methods, the RFC-based DDoS detection method achieves a higher detection rate and a lower false alarm rate.
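To make the three entropy features concrete, the following Python sketch computes SIDI, SIDP, and DPDI over one window of flows. The abstract does not give the exact formula, so the sketch assumes each feature is a conditional entropy H(X | Y) in bits, for example SIDI = H(sourceIP | destinationIP); the function names `conditional_entropy` and `entropy_features` and the sample addresses are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def conditional_entropy(pairs):
    """Return H(X | Y) in bits for an iterable of (x, y) observations.

    Assumption: the paper's entropy features are read here as conditional
    entropies; the abstract itself does not state the formula.
    """
    pairs = list(pairs)
    total = len(pairs)
    if total == 0:
        return 0.0
    # Count how often each x occurs together with each y.
    groups = defaultdict(Counter)
    for x, y in pairs:
        groups[y][x] += 1
    h = 0.0
    for counter in groups.values():
        n_y = sum(counter.values())
        for count in counter.values():
            # p(x, y) * log2 p(x | y), accumulated with a minus sign.
            h -= (count / total) * math.log2(count / n_y)
    return h


def entropy_features(flows):
    """Compute the three features for one window of flows.

    flows: iterable of (source_ip, destination_ip, destination_port).
    """
    flows = list(flows)
    return {
        "SIDI": conditional_entropy((s, d) for s, d, p in flows),
        "SIDP": conditional_entropy((s, p) for s, d, p in flows),
        "DPDI": conditional_entropy((p, d) for s, d, p in flows),
    }


# Hypothetical windows: four sources hitting one target (many-to-one)
# versus four unrelated one-to-one flows.
attack_window = [("10.0.0.%d" % i, "192.168.1.1", 80) for i in range(4)]
normal_window = [("10.0.0.%d" % i, "192.168.1.%d" % i, 8000 + i) for i in range(4)]

attack_features = entropy_features(attack_window)
normal_features = entropy_features(normal_window)
```

Under this assumed definition, the many-to-one window yields SIDI and SIDP of 2 bits (four equally likely sources for a single destination and port), while the one-to-one window yields 0 for all three features. Feature vectors of this kind, one per time window, are what a random forest classifier would then be trained on to separate normal from attack traffic.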