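Application identification and traffic classification are prerequisites for network management, security, research, and related tasks. With the rapid growth of networks and the continual emergence of new applications, classification techniques based on transport-layer port numbers and deep packet inspection can no longer meet demand. This paper verifies that the statistical properties of network traffic can effectively distinguish different applications, proposes a supervised network traffic classification method based on the C4.5 decision tree classifier, and discusses two improvements: boosting and feature selection. Experimental results show that the C4.5 classifier has moderate training complexity, high accuracy, and fast classification speed; boosting further improves the classifier's accuracy at the cost of a substantial increase in training time and a slight slowdown in classification; the feature selection algorithm, in turn, increases classification speed while slightly reducing accuracy.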
Traffic classification, or application identification, is an essential step for a number of network tasks, including management, security, and research. The diminished effectiveness of traditional port-based traffic classifiers and the overhead of deep packet inspection approaches motivate new techniques. Traffic statistics have been shown to discriminate between applications. In this paper, we propose a supervised method based on a boosted C4.5 decision tree classifier. Experimental results show that the C4.5 classifier performs fast classification and achieves high accuracy, while the boosted C4.5 classifier achieves higher accuracy at the cost of much longer training time and a slightly slower classification rate.
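To illustrate how a C4.5-style tree can separate applications using a flow statistic, the following is a minimal sketch of the gain-ratio criterion that C4.5 uses to choose a threshold on a continuous attribute. It is a simplified illustration, not the implementation evaluated in this paper: the feature (mean packet size), the flow values, and the labels are made-up assumptions, and the helper names `entropy`, `gain_ratio`, and `best_threshold` are hypothetical. Pruning, boosting, and the other refinements of the full algorithm are omitted.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty carries no information
    n = len(labels)
    w_left, w_right = len(left) / n, len(right) / n
    info_gain = entropy(labels) - w_left * entropy(left) - w_right * entropy(right)
    split_info = -(w_left * math.log2(w_left) + w_right * math.log2(w_right))
    return info_gain / split_info


def best_threshold(values, labels):
    """Evaluate midpoints between consecutive distinct values.

    Returns the (threshold, gain_ratio) pair with the highest gain ratio.
    """
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return max(((t, gain_ratio(values, labels, t)) for t in candidates),
               key=lambda pair: pair[1])


# Made-up example flows (assumption): mean packet size in bytes and application label.
mean_pkt_size = [80, 95, 110, 900, 1200, 1400]
app = ["dns", "dns", "dns", "web", "web", "web"]

threshold, ratio = best_threshold(mean_pkt_size, app)
print(f"split at mean packet size <= {threshold}: gain ratio {ratio:.3f}")
```

On this toy data a single threshold separates the two classes completely, which is the kind of split a decision tree would place near its root. Real traffic needs many such tests over several statistical features.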