Detecting super-connection hosts is an important problem in network security, and flow sampling is the basis for solving it in high-speed networks. Existing solutions use hash-based flow sampling algorithms, which assume that uniform random hash functions are available. However, prior work has not evaluated whether this assumption is reasonable. Through technical analysis and experimental tests, this paper concludes that the assumption does not hold for linear flow ID sequences in high-speed networks (above 2.5 Gbps). The paper then proposes a new flow sampling algorithm based on the Bloom filter data structure. Analysis shows that the new algorithm supports 10 Gbps line-speed processing with low space complexity. Finally, experiments on real Internet traces show that the new algorithm achieves equal-probability random flow sampling independent of the flow ID.
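The abstract does not spell out the algorithm's internals, so the following is only a minimal sketch of one way a Bloom filter could decouple the sampling decision from the flow ID; it is not the paper's actual method. It assumes the Bloom filter is used solely to recognize flows that have already been seen, and that the sampling decision for a new flow comes from a random number generator rather than from a hash of the flow ID. The names `BloomFilter`, `FlowSampler`, `observe`, and all parameter values are hypothetical.

```python
import hashlib
import random


class BloomFilter:
    """Plain Bloom filter over byte strings; k positions via double hashing."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: bytes):
        digest = hashlib.sha256(key).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key: bytes) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


class FlowSampler:
    """Assumed design: the Bloom filter remembers seen flows, and an RNG
    (not a hash of the flow ID) decides whether a new flow is sampled."""

    def __init__(self, rate: float, num_bits: int = 1 << 20,
                 num_hashes: int = 4, seed=None) -> None:
        self.rate = rate
        self.seen = BloomFilter(num_bits, num_hashes)
        self.sampled = set()  # flow IDs selected for monitoring
        self.rng = random.Random(seed)

    def observe(self, flow_id: bytes) -> bool:
        """Process one packet; return True if its flow is sampled."""
        if flow_id in self.seen:
            # Flow was decided earlier (or is a Bloom false positive).
            return flow_id in self.sampled
        self.seen.add(flow_id)
        if self.rng.random() < self.rate:
            self.sampled.add(flow_id)
            return True
        return False


# Linear (consecutive) flow IDs, the case the abstract singles out.
sampler = FlowSampler(rate=0.1, seed=1)
for i in range(10000):
    sampler.observe(i.to_bytes(4, "big"))
print(len(sampler.sampled), "of 10000 flows sampled")
```

In this sketch the probability of sampling a flow does not depend on the value of its ID, only on the random draw made at its first packet. One caveat of the assumed design: a Bloom filter false positive makes a new flow look already seen, so it is never sampled, which lowers the effective rate slightly as the filter fills.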