为降低自动机类多模匹配算法的空间开销,同时仍保持较低的算法时间复杂度,提出了一种基于位图的空间优化算法.将自动机全部状态按照字典树结构的层数划分,将访问频率较低的后若干层状态对应的转移表压缩存储,并使用位图提高对被压缩信息的检索速度.经过实验和在实际应用环境中的验证,这种改进算法能够大幅降低空间开销,而匹配时间或响应时间基本不变.在模式串的数量达到万条以上规模时,实验表明优化算法能够降低25%~70%的空间消耗.
In order to reduce the memory consumption of automata algorithms in the field of multi-pattern matching, this paper proposes an efficient space optimization algorithm based on bitmap. It divides all the states in the automata into two groups by their depth in the data structure"trie", and reduces the memory consumption of the deeper group that will be retrieved less. It also makes use of bitmap to improve the time efficiency of the deeper group. Our experiments show that this algorithm can sharply reduce the memory consumption, and when the number of patterns is more than the level of 10 thousand, the space consumption can be decreased by 25% -70%.