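Network application identification is a basic prerequisite for a range of tasks such as network management, research, planning, and security, and identification techniques based on packet port numbers and packet payloads are increasingly unable to meet this need. Because different applications exhibit distinct traffic characteristics, machine learning techniques can be used to mine the traffic patterns of each application and thereby identify them effectively. This paper uses simple traffic features as observations for supervised application identification. By comparing several general-purpose machine learning algorithms, we find the supervised learning scheme best suited to the application identification problem, and we apply feature selection algorithms to find the key traffic features.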
Application identification plays an important role in various network activities. Because traditional port-based and payload-based methods have become ineffective, recent work has proposed using machine learning techniques to identify applications based on the statistical characteristics of traffic flows. In this study, we use simple characteristics to describe traffic flows and then identify the most suitable supervised ML classifier for the application identification problem by comparing various ML schemes. We also apply feature selection to identify the most significant features.
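As an illustration of the workflow summarized above, the following minimal sketch compares two supervised classifiers on labelled flow features and then ranks the features with a simple filter score. Everything specific in it is an assumption made for illustration: the feature names, the three application classes, the synthetic data, the two classifiers (nearest centroid and Gaussian naive Bayes), and the Fisher-style ranking score are not taken from this study, which does not name its algorithms or features here.

```python
import math
import random
from statistics import mean, pvariance

# Hypothetical flow features; the study's actual feature set is not listed here.
FEATURES = ["mean_pkt_size", "mean_interarrival_ms", "duration_s"]


def make_flows(n_per_class, seed=0):
    """Generate synthetic labelled flows (made-up values, not measured data)."""
    rng = random.Random(seed)
    # (mean, std) per feature for each made-up application class.
    # duration_s has the same distribution everywhere, so it carries no signal.
    profiles = {
        "web": [(600, 150), (40, 15), (5, 3)],
        "bulk": [(1300, 100), (5, 2), (5, 3)],
        "chat": [(120, 40), (400, 120), (5, 3)],
    }
    flows = []
    for label, params in profiles.items():
        for _ in range(n_per_class):
            x = [max(0.0, rng.gauss(m, s)) for m, s in params]
            flows.append((x, label))
    rng.shuffle(flows)
    return flows


def group_by_label(flows):
    groups = {}
    for x, y in flows:
        groups.setdefault(y, []).append(x)
    return groups


class NearestCentroid:
    """Assign a flow to the class whose mean feature vector is closest."""

    def fit(self, flows):
        self.centroids = {
            y: [mean(col) for col in zip(*xs)]
            for y, xs in group_by_label(flows).items()
        }
        return self

    def predict(self, x):
        return min(self.centroids, key=lambda y: math.dist(x, self.centroids[y]))


class GaussianNaiveBayes:
    """Per-class, per-feature Gaussian likelihoods with class priors."""

    def fit(self, flows):
        self.stats = {}
        for y, xs in group_by_label(flows).items():
            prior = math.log(len(xs) / len(flows))
            # Small constant keeps the variance strictly positive.
            params = [(mean(c), pvariance(c) + 1e-9) for c in zip(*xs)]
            self.stats[y] = (prior, params)
        return self

    def predict(self, x):
        def log_posterior(y):
            prior, params = self.stats[y]
            return prior + sum(
                -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
                for xi, (m, v) in zip(x, params)
            )

        return max(self.stats, key=log_posterior)


def accuracy(model, train, test):
    model.fit(train)
    return sum(model.predict(x) == y for x, y in test) / len(test)


def fisher_scores(flows):
    """Between-class over within-class variance, one score per feature."""
    groups = group_by_label(flows)
    scores = []
    for j in range(len(FEATURES)):
        overall = mean(x[j] for x, _ in flows)
        between = sum(
            len(xs) * (mean(x[j] for x in xs) - overall) ** 2
            for xs in groups.values()
        )
        within = sum(len(xs) * pvariance([x[j] for x in xs]) for xs in groups.values())
        scores.append(between / within)
    return scores


flows = make_flows(200, seed=1)
train, test = flows[:450], flows[450:]  # simple hold-out split

# Compare classifiers on the same split.
results = {
    type(m).__name__: accuracy(m, train, test)
    for m in (NearestCentroid(), GaussianNaiveBayes())
}
# Rank features from most to least discriminative.
ranking = sorted(zip(FEATURES, fisher_scores(train)), key=lambda p: p[1], reverse=True)

print(results)
print(ranking)
```

In this sketch the uninformative `duration_s` feature ends up at the bottom of the ranking, which is the kind of outcome a feature selection step is meant to expose. A real comparison would use measured traces, cross-validation, and a broader set of classifiers.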