Because network environments are dynamic and the distribution of network applications changes over time, P2P traffic identification faces the problem of concept drift. Existing static traffic identification models cannot handle this problem, so this paper proposes a new P2P traffic identification model that addresses both noise and concept drift. Drawing on the idea of clustering, the K-nearest-neighbor algorithm is used to filter noise. Concept drift detection is implemented based on the principles of hypothesis evaluation and the central limit theorem. On this basis, a multi-classifier ensemble scheme is built from the uncertainty outputs of the base classifiers, and base classifiers whose concepts have become obsolete are eliminated by a time-based policy. Theoretical analysis and simulation results show that the proposed model is feasible.
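The abstract does not give the details of the noise filter or the drift test, so the following is only a minimal sketch of how the two steps might look, not the authors' implementation. It assumes that noise filtering drops any training flow whose label disagrees with the majority label of its k nearest neighbors, and that drift is flagged when the classifier's error rate on a new window is significantly higher than on a reference window, using the normal approximation that the central limit theorem provides for sample error. The function names `knn_filter` and `drift_detected`, the feature tuples, and the threshold `z=1.645` are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def knn_filter(samples, labels, k=3):
    """Return indices of samples whose label matches the majority label
    of their k nearest neighbors (Euclidean distance).

    Assumption: this neighbor-vote rule stands in for the paper's
    KNN-based noise filter, whose exact form the abstract does not state.
    """
    keep = []
    for i, x in enumerate(samples):
        # Distances from sample i to every other sample, nearest first.
        dists = sorted(
            (math.dist(x, y), j) for j, y in enumerate(samples) if j != i
        )
        votes = Counter(labels[j] for _, j in dists[:k])
        if votes.most_common(1)[0][0] == labels[i]:
            keep.append(i)
    return keep


def drift_detected(err_ref, n_ref, err_new, n_new, z=1.645):
    """One-sided test for an increase in error rate between two windows.

    Each sample error is treated as approximately normal with variance
    e * (1 - e) / n, so the difference of the two errors is approximately
    normal as well. Assumption: z = 1.645 (about 95% one-sided confidence)
    is an illustrative threshold, not a value taken from the paper.
    """
    var = err_ref * (1 - err_ref) / n_ref + err_new * (1 - err_new) / n_new
    if var == 0:
        # Both error rates are exactly 0 or 1; fall back to a direct comparison.
        return err_new > err_ref
    return (err_new - err_ref) / math.sqrt(var) > z


# Toy data: two well-separated groups plus one mislabeled point at (0.5, 0.5).
flows = [(0, 0), (0, 1), (1, 0), (1, 1),
         (10, 10), (10, 11), (11, 10), (11, 11),
         (0.5, 0.5)]
flow_labels = ["p2p"] * 4 + ["web"] * 4 + ["web"]

kept = knn_filter(flows, flow_labels, k=3)
rising = drift_detected(0.05, 1000, 0.15, 1000)
stable = drift_detected(0.05, 1000, 0.055, 1000)
```

In this toy run the mislabeled point is dropped by the filter, the jump in error from 5% to 15% is flagged as drift, and the change from 5% to 5.5% is not. A drift signal of this kind would be the trigger for training a new base classifier and, under the time-based policy the abstract mentions, retiring the oldest one.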