多模式匹配是基于内容检测的网络安全系统的重要功能,同时,它在很多领域具有广泛的应用.实际应用中,高速且性能稳定的大规模模式匹配方法需求迫切,尤其是能够在线实时处理网络包的匹配体系结构.介绍了一种存储有效的高速大规模模式匹配算法及相关体系结构.研究从算法所基于的理论入手,提出了缓存状态机模型,并结合状态机中转换规则分类,提出了交叉转换规则动态生成的匹配算法ACC(Aho.Corasick.CDFA).该算法通过动态生成转换规则降低了生成状态机的规模,适用于大规模模式集.进一步提出了基于该算法的体系结构设计.采用网络安全系统中真实模式集进行的实验结果表明,该算法相比其他状态机类模式匹配算法,可以进一步减少80%-95%的状态机规模,存储空间降低40.7%,存储效率提高近2倍’算法单硬件结构实现可以达到11Gbps的匹配速度.
Pattern matching is the main part of content inspection based network security systems, and it is widely used in many other applications. In practice, pattern matching methods for large scale sets with stable performance are in great demand, especially the architecture for online real-time processing. This paper presents a memory efficient pattern matching algorithm and architecture for a large scale set. This paper first proposes cached deterministic finite automata, namely CDFA, in the view of basic theory. By classifying transitions in pattern DFA, a new algorithm, ACC, based on CDFA is addressed. This algorithm can dynamically generate cross transitions and save most of memory resources, so that it can support large scale pattern set. Further, an architecture based on this method is proposed Experiments on real pattern sets show that the number of transition rules can be reduced 80%-95% than the current most optimized algorithms. At the same time, it can save 40.7% memory space, nearly 2 times memory efficiency. The corresponding architecture in single chip can achieve about 11Gbps matching performance.