针对可扩展标记语言(XML)基本查询操作符——包含连接,提出了一种基于权重哈尔小波的结果数估计方法.该方法利用哈尔小波有效压缩XML包含连接结果统计,并通过小波摘要维护统计信息.在估计阶段,使用小波系数重构包含连接结果数.为了减小估计误差,提出基于标签名查询频率的权重模型,并集成于哈尔小波估计方法中.实验证明,对于XML包含连接结果数估计,权重哈尔小波估计方法优于先前的估计方法(如直方图法、随机取样法).在相同的空间限制下,权重小波估计具有更小的平均相对误差.
A weighted Haar wavelet method is proposed for estimating the result size of containment joins, the basic operator in extensible markup language (XML) structural query processing. The method uses the Haar wavelet to compress containment join size statistics compactly and maintains them in a wavelet synopsis. At estimation time, the containment join size is reconstructed from the wavelet coefficients. To reduce the estimation error, a weight model based on the query frequency of XML tag names is proposed and integrated into the Haar wavelet method. Experimental results show that the weighted Haar wavelet method outperforms previous join size estimation methods, e.g., histogram-based and sampling-based methods: under the same space budget, it achieves a smaller mean relative error.
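The compress-then-reconstruct idea summarized above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the input layout (a power-of-two array `join_sizes` holding one exact join size per entry), the per-entry weights `query_freq`, the scoring rule, and all function names are assumptions made here for illustration. The score used for each coefficient is the weighted squared error that dropping it would introduce, which is only one plausible way to turn query frequencies into coefficient weights.

```python
def haar_decompose(values):
    """Unnormalized Haar transform: [overall average, details coarse-to-fine]."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    current = [float(v) for v in values]
    details = []
    while len(current) > 1:
        avgs = [(current[i] + current[i + 1]) / 2 for i in range(0, len(current), 2)]
        diffs = [(current[i] - current[i + 1]) / 2 for i in range(0, len(current), 2)]
        details = diffs + details  # coarser levels end up in front
        current = avgs
    return current + details


def haar_reconstruct(coeffs):
    """Inverse of haar_decompose."""
    current = [coeffs[0]]
    pos = 1
    while pos < len(coeffs):
        level = coeffs[pos:pos + len(current)]
        pos += len(current)
        expanded = []
        for avg, diff in zip(current, level):
            expanded.extend((avg + diff, avg - diff))
        current = expanded
    return current


def support(j, n):
    """Range of data positions influenced by coefficient j."""
    if j == 0:
        return range(n)
    level_start = 1 << (j.bit_length() - 1)  # index of first coefficient on j's level
    size = n // level_start
    k = j - level_start
    return range(k * size, (k + 1) * size)


def build_synopsis(join_sizes, query_freq, budget):
    """Keep the `budget` coefficients with the largest weighted importance.

    Assumed scoring rule (not taken from the paper): dropping coefficient c
    shifts every value in its support by +/-c, so its weighted squared error
    is c**2 times the total query frequency over that support.
    """
    n = len(join_sizes)
    coeffs = haar_decompose(join_sizes)
    scores = [c * c * sum(query_freq[i] for i in support(j, n))
              for j, c in enumerate(coeffs)]
    kept = sorted(range(n), key=lambda j: scores[j], reverse=True)[:budget]
    return {j: coeffs[j] for j in kept}


def estimate(synopsis, n):
    """Reconstruct approximate join sizes; dropped coefficients count as zero."""
    return haar_reconstruct([synopsis.get(j, 0.0) for j in range(n)])
```

With equal weights the rule reduces to ordinary largest-coefficient thresholding of the unnormalized transform. Raising the weight of frequently queried entries shifts the retained coefficients toward the regions those queries touch, which is the intuition behind weighting by tag-name query frequency.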