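By analyzing the functional strengths and weaknesses of the Aho-Corasick (AC) multi-pattern matching algorithm and of regular-expression search, and by studying the similarities and differences in how each generates a deterministic finite automaton (DFA), this paper merges the AC algorithm with regular expressions for multi-pattern matching of text. The resulting AC algorithm can recognize regular expressions while keeping the properties of the original algorithm after a match failure: the pointer into the target string does not move back, and the AC automaton backs up only slightly. The algorithm therefore combines the advantages of both approaches. The paper also discusses parallelizing the algorithm on a GPU within the CUDA parallel programming environment, and compares in detail the performance of implementations that use different types of GPU memory.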
Multi-pattern matching algorithms are widely used in text searching, intrusion detection, and other areas. We focus on two matching approaches: the AC algorithm and regular expressions. By comparing how each builds its DFA, we integrate the two construction processes into a new AC algorithm. The new algorithm retains the advantages of the traditional AC algorithm: when a match fails, the pointer into the target text does not move back, and the AC automaton pointer moves back by only a few steps. We also discuss parallelizing the AC algorithm on a GPU with CUDA, and compare the running performance when using GPU global memory versus texture memory.
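The no-backtracking property mentioned above can be illustrated with a minimal sketch of the classic AC automaton. This is the textbook algorithm only, not the regular-expression-aware variant or the CUDA implementation the paper describes; the names `build_automaton` and `search` and the sample patterns are assumptions chosen for illustration.

```python
from collections import deque


def build_automaton(patterns):
    """Build goto, failure, and output tables for the classic AC automaton."""
    goto = [{}]   # goto[state][char] -> next state (trie edges)
    fail = [0]    # fail[state] -> state to fall back to on a mismatch
    out = [[]]    # out[state] -> patterns that end at this state

    # Phase 1: insert every pattern into a trie.
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].append(p)

    # Phase 2: breadth-first pass to compute failure links.
    queue = deque(goto[0].values())  # depth-1 states fail to the root
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            queue.append(t)
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            # Inherit matches that end at the failure target.
            out[t] = out[t] + out[fail[t]]
    return goto, fail, out


def search(text, patterns):
    """Return (start_index, pattern) pairs; the text index only moves forward."""
    goto, fail, out = build_automaton(patterns)
    s = 0
    hits = []
    for i, ch in enumerate(text):
        # On a mismatch only the automaton state falls back, never i.
        while s and ch not in goto[s]:
            s = fail[s]
        s = goto[s].get(ch, 0)
        for p in out[s]:
            hits.append((i - len(p) + 1, p))
    return hits
```

For example, `search("ushers", ["he", "she", "his", "hers"])` scans the text once: on the mismatch at `r` the automaton follows a single failure link from the `she` state to the `he` state, while the text index keeps advancing.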