为减少软件更新中增量包的大小,提出了基于动态字典的增量更新算法DICDIFF。该算法使用后缀排序方法预处理新老版本文件,将预处理的结果以后缀数组和名次数组的形式存储为字典目录,基于该字典目录能够快速查找字典数据集与待编码数据之间的相同数据段。随着编码进度的推进,动态扩展字典数据集能够使用更多已知数据段用于构造待编码数据。通过选取多款软件的新旧版本作为实验样本,DICDIFF在平均情况下能够节省68.9%的网络流量,高于现有其他增量更新算法。实验结果表明,该算法能够进一步减少增量更新过程中的网络流量。
To reduce the size of delta file for incremental updates,an incremental update algorithm was proposed based on dynamic dictionary named differential compression based on dynamic dictionary(DICDIFF).This algorithm uses the suffix sorting algorithm to preprocess the old and new versions.The pretreatments result in the forms of suffix array and rank array which are stored in the dictionary catalogue.Based on this dictionary catalogue,DICDIFF finds the identical segments between the dictionary data set and the data set to be encoded quickly.With the progress of encoding,DICDIFF is expanding the dictionary data set dynamically.This algorithm gets more segments which could be copied from the dictionary data set to the new version file.Experiments use software from different operating systems as samples,and the average data transmission saving is as high as 69.3%.In comparison with the existing methods,DICDIFF constructs the minimal delta file.The experimental results show that DICDIFF can reduce the network traffic for incremental updates.