Approximate string matching is an important research direction within pattern matching. The compressed suffix array is an index structure widely used in string matching and data compression, offering fast retrieval and broad applicability. Building on the compressed suffix array, this paper proposes a data structure suited to approximate string matching and, on top of that structure, a matching search algorithm. Experimental results show that, compared with existing algorithms, the proposed algorithm has a computational advantage when the alphabet is small.
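For readers unfamiliar with index-based approximate matching, the general idea can be illustrated with a minimal sketch. It is not the data structure or algorithm proposed in this paper: it assumes a plain (uncompressed) suffix array, measures error by Hamming distance (mismatches only) rather than edit distance, and uses the standard pigeonhole filter, in which a pattern split into k+1 pieces must contain at least one piece that matches exactly whenever the whole pattern matches with at most k mismatches. All function names are hypothetical.

```python
def build_suffix_array(text):
    # Naive construction by sorting suffixes; adequate only for small inputs.
    return sorted(range(len(text)), key=lambda i: text[i:])


def exact_occurrences(text, sa, piece):
    # Binary search for the block of suffixes whose prefix equals `piece`.
    m = len(piece)
    lo, hi = 0, len(sa)
    while lo < hi:  # first suffix whose length-m prefix is >= piece
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < piece:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    hi = len(sa)
    while lo < hi:  # first suffix whose length-m prefix is > piece
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= piece:
            lo = mid + 1
        else:
            hi = mid
    return [sa[i] for i in range(start, lo)]


def hamming(a, b):
    # Number of positions at which two equal-length strings differ.
    return sum(x != y for x, y in zip(a, b))


def k_mismatch_search(text, sa, pattern, k):
    # Pigeonhole filter: with at most k mismatches, at least one of the
    # k+1 pieces of the pattern must occur exactly in the text.
    n, m = len(text), len(pattern)
    if m == 0 or m > n or k < 0:
        return []
    bounds = [i * m // (k + 1) for i in range(k + 2)]
    hits = set()
    for start, end in zip(bounds, bounds[1:]):
        for pos in exact_occurrences(text, sa, pattern[start:end]):
            cand = pos - start  # implied start of the whole pattern
            if 0 <= cand <= n - m and hamming(text[cand:cand + m], pattern) <= k:
                hits.add(cand)
    return sorted(hits)


text = "ACGTACGTTACG"
sa = build_suffix_array(text)
print(k_mismatch_search(text, sa, "ACGA", 1))  # [0, 4]
```

The filter only proposes candidate positions; every candidate is verified directly against the text, so the result is exact for the Hamming-distance variant of the problem. A compressed suffix array would replace the explicit `sa` list and the direct slices of `text` with its own access operations.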