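This paper concerns performance control of network communication traffic. To address the problem that large numbers of irrelevant and redundant features limit improvements in network traffic classification performance, it proposes a semi-supervised network traffic feature selection method based on hybrid constraints. Building on semi-supervised learning, the method uses a feature evaluation measure that combines pairwise constraints with unlabeled samples to quickly remove irrelevant features, and then uses mutual-information-based feature correlation to filter redundant features from those that remain, so that supervised and unsupervised information contribute to network traffic feature selection in different ways. Experimental results show that, compared with traditional network traffic feature selection methods, the improved method achieves better network traffic classification performance with fewer features.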
To address the problem that a large number of irrelevant and redundant features restrict network traffic classification performance, this paper proposes a semi-supervised feature selection method for network traffic classification based on hybrid constraints. The method uses a feature evaluation measure that combines pairwise constraints with unlabeled samples to remove irrelevant features, and then uses feature correlation based on mutual information to filter redundant features from the remaining ones, which enables supervised and unsupervised information to play their roles in different ways. Experimental results show that, compared with traditional network traffic feature selection methods, the proposed method obtains better network traffic classification performance with fewer features.
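The two-stage procedure summarized above can be illustrated with a minimal Python sketch. The abstract does not give the exact evaluation formula, so everything specific here is an assumption made for illustration: the relevance score (spread across cannot-link pairs divided by spread across must-link pairs plus a nearest-neighbor locality term computed from all samples, labeled or not), the histogram-based mutual information estimate, and the names `constraint_relevance`, `normalized_mi`, `select_features`, `n_relevant`, and `mi_threshold`. It should not be read as the paper's actual method.

```python
import numpy as np

def constraint_relevance(X, must_link, cannot_link, k=5, lam=1.0, eps=1e-12):
    """Per-feature relevance score; higher means more relevant.

    Hypothetical formula (not taken from the paper): cannot-link spread
    divided by must-link spread plus a locality term from all samples.
    """
    X = np.asarray(X, dtype=float)
    # Standardize so features on different scales are comparable.
    Z = (X - X.mean(axis=0)) / (X.std(axis=0) + eps)
    ml = np.asarray(must_link)
    cl = np.asarray(cannot_link)
    # Supervised part: mean squared gap over constrained pairs, per feature.
    ml_gap = ((Z[ml[:, 0]] - Z[ml[:, 1]]) ** 2).mean(axis=0)
    cl_gap = ((Z[cl[:, 0]] - Z[cl[:, 1]]) ** 2).mean(axis=0)
    # Unsupervised part: mean squared gap to the k nearest neighbors,
    # computed over every sample without using any label information.
    D = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(D, np.inf)
    nn = np.argsort(D, axis=1)[:, :k]
    nn_gap = ((Z[:, None, :] - Z[nn]) ** 2).mean(axis=(0, 1))
    return cl_gap / (ml_gap + lam * nn_gap + eps)

def normalized_mi(a, b, bins=8):
    """Histogram estimate of mutual information, divided by min entropy."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p = joint / joint.sum()
    pa = p.sum(axis=1, keepdims=True)
    pb = p.sum(axis=0, keepdims=True)
    nz = p > 0
    mi = (p[nz] * np.log(p[nz] / (pa @ pb)[nz])).sum()
    ha = -(pa[pa > 0] * np.log(pa[pa > 0])).sum()
    hb = -(pb[pb > 0] * np.log(pb[pb > 0])).sum()
    h = min(ha, hb)
    return float(mi / h) if h > 0 else 0.0

def select_features(X, must_link, cannot_link, n_relevant, mi_threshold=0.5):
    """Stage 1: keep the n_relevant top-scoring features.
    Stage 2: greedily drop features redundant with an already kept one."""
    X = np.asarray(X, dtype=float)
    scores = constraint_relevance(X, must_link, cannot_link)
    ranked = np.argsort(scores)[::-1][:n_relevant]
    kept = []
    for f in ranked:
        if all(normalized_mi(X[:, f], X[:, g]) < mi_threshold for g in kept):
            kept.append(int(f))
    return kept
```

Here `must_link` and `cannot_link` are lists of sample-index pairs known to belong to the same or to different traffic classes. Splitting the work into a cheap relevance ranking followed by a pairwise redundancy check mirrors the division of labor described in the abstract; the cutoff `n_relevant` and the threshold `mi_threshold` are illustrative knobs that would need tuning on real traffic data.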