To optimize the performance of regular expression matching on FPGAs, the key factors that affect it must first be identified. From the performance test results of the individual L7-filter rules, three characteristics are identified that distinguish rules with a low clock frequency from rules with a high clock frequency. Specific regular expression test models, each consisting of multiple cascaded character classes, are then designed to measure how strongly these three characteristics affect the performance of FPGA-based regular expression automata. The experiments lead to the following conclusions: the clock frequency of an FPGA-based regular expression automaton declines rapidly as the width of the character class grows and declines slowly as the number of cascaded character classes grows; the asterisk (*) and question mark (?) repetition operators reduce the frequency of character class rules more than the plus (+) repetition operator does. Finally, the conclusions obtained for character classes are generalized to the more general case of an OR (|) operation over a large number of characters.
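The test models described above can be pictured with a small pattern generator. The following Python sketch is illustrative only and is not the authors' tooling: the function names, the symbol pool, and the reading of "width" as the number of distinct symbols in one character class are all assumptions made for this example.

```python
import re
import string

# Hypothetical symbol pool used to fill character classes (an assumption).
ALPHABET = string.ascii_lowercase + string.digits


def char_class(width):
    """Build a character class with `width` distinct symbols, e.g. [abc]."""
    if not 1 <= width <= len(ALPHABET):
        raise ValueError("width must be between 1 and %d" % len(ALPHABET))
    return "[" + ALPHABET[:width] + "]"


def cascade_pattern(width, count, repeat=""):
    """Concatenate `count` copies of a character class of the given width.

    `repeat` is an optional repetition operator applied to every class:
    "" (none), "*", "?" or "+".
    """
    if repeat not in ("", "*", "?", "+"):
        raise ValueError("unsupported repetition operator: %r" % repeat)
    if count < 1:
        raise ValueError("count must be at least 1")
    return (char_class(width) + repeat) * count


def as_alternation(width):
    """Write the same character class as an explicit OR (|) of single characters."""
    if not 1 <= width <= len(ALPHABET):
        raise ValueError("width must be between 1 and %d" % len(ALPHABET))
    return "(" + "|".join(ALPHABET[:width]) + ")"


if __name__ == "__main__":
    # Sweep the three factors: class width, cascade count, repetition operator.
    for width in (2, 8, 32):
        for count in (1, 4):
            for repeat in ("", "*", "?", "+"):
                pattern = cascade_pattern(width, count, repeat)
                re.compile(pattern)  # sanity check: the pattern is well-formed
                print(width, count, repr(repeat), pattern)
```

Each generated pattern would then be synthesized into an FPGA circuit and its achievable clock frequency recorded; that step depends on the regex-to-hardware tool chain and is outside this sketch. The `as_alternation` helper shows why results for character classes carry over to the OR case: a class of `width` symbols is equivalent to an alternation of the same `width` single characters.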