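As a typical application of data stream classification, machine-learning-based network traffic identification places increasingly high demands on concept drift detection methods. To address this problem, the paper first analyzes two typical approaches to concept drift detection. Then, taking into account the class imbalance that is common in real network environments, it proposes a concept drift detection algorithm, CF_CDD, and discusses the algorithm's principle and statistical foundations in detail. Based on the proposed detection algorithm, a weight-based ensemble classifier algorithm, TCEL_CF_CDD, is then constructed to achieve adaptive traffic identification. Finally, experiments verify the feasibility of the proposed concept drift detection algorithm.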
Network traffic identification based on machine learning, a typical application of data stream classification, places strict requirements on concept drift detection. To address this problem, the paper first analyzes two typical methods of concept drift detection. Then, considering the actual non-stationary network environment, the paper presents a new method of concept drift detection, called CF_CDD, and discusses its basic theory in detail. After that, based on the results of CF_CDD, the paper builds the TCEL_CF_CDD ensemble classifier using a weighting algorithm to achieve adaptive traffic identification. The experimental results verify the feasibility of the TCEL_CF_CDD algorithm proposed in this paper.