To address the low chunking efficiency and coarse chunking granularity of the Winnowing chunking algorithm, a double-chunking file synchronization method (DF-RSYNC) is proposed. The method uses a circular queue to store the local byte fingerprint values of the sliding blocks within each fixed window, which avoids recomputing the fingerprints of sliding blocks in the overlapping part of adjacent windows. It also adopts a coarse-to-fine two-level chunking scheme with a two-round-trip synchronization algorithm to improve the accuracy of difference detection and reduce the amount of differing data. Experimental results show that the method effectively reduces chunking time and improves the efficiency of difference calculation, and that it detects file differences at a finer granularity and with a higher duplicate detection rate than the traditional algorithm.
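The circular-queue idea can be illustrated with a minimal Python sketch. This is not the DF-RSYNC implementation: the function names (`kgram_hashes`, `winnow`, `chunk`), the parameters `k` and `w`, the use of CRC32 as the fingerprint, and the rightmost-minimum selection rule are all assumptions chosen for illustration. The sketch computes each sliding-block fingerprint once, keeps the last `w` fingerprints in a circular queue, and rescans the window only when the previously selected minimum slides out.

```python
import zlib


def kgram_hashes(data: bytes, k: int):
    """Yield one fingerprint per k-byte sliding block.

    Assumption: CRC32 stands in for whatever fingerprint the method uses.
    """
    for i in range(len(data) - k + 1):
        yield zlib.crc32(data[i:i + k])


def winnow(data: bytes, k: int = 4, w: int = 8):
    """Return positions of the fingerprints selected by winnowing.

    Each fingerprint is computed once and stored in a circular queue of
    size w, so overlapping windows reuse the stored values.
    """
    ring = [0] * w          # circular queue holding the last w fingerprints
    selected = []
    min_pos = -1            # absolute index of the current window minimum
    for i, h in enumerate(kgram_hashes(data, k)):
        ring[i % w] = h
        if i < w - 1:
            continue        # first window is not full yet
        start = i - w + 1   # absolute index of the window's first fingerprint
        if min_pos < start:
            # The old minimum left the window: rescan the stored values,
            # keeping the rightmost minimum on ties.
            min_pos = start
            for j in range(start, i + 1):
                if ring[j % w] <= ring[min_pos % w]:
                    min_pos = j
            selected.append(min_pos)
        elif h <= ring[min_pos % w]:
            # The newest fingerprint becomes the window minimum.
            min_pos = i
            selected.append(min_pos)
    return selected


def chunk(data: bytes, k: int = 4, w: int = 8):
    """Cut data at the selected positions (hypothetical boundary rule)."""
    cuts = [p for p in winnow(data, k, w) if p > 0]
    bounds = [0] + cuts + [len(data)]
    return [data[a:b] for a, b in zip(bounds, bounds[1:])]
```

Under the same assumptions, a coarse-to-fine pass could call `chunk` with a large `w` first and then call it again with a smaller `w` only on the chunks that did not match; how DF-RSYNC actually selects and exchanges chunks in its two round trips is not specified here.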