与传统数据库对XML数据的处理不同,对XML数据流的处理不仅受实时性的约束,还受存储空间的限制.在XML片段无序传送的广播模型中,考虑在XML数据流上进行高效的关键字查询,进而首次提出近似SLCA算法.SLCA算法利用结构Hash表和LCA表对关键字进行匹配并计算SLCA,从而避免冗余操作.同时,SLCA算法可以对匹配结果立即输出而不必等到数据流传输结束.实验结果表明,基于Hole-Filler模型的XML数据流上的SLCA算法在节省时间和空间开销方面均表现出较好的性能.
Unlike in traditional databases,queries on XML streams are bounded not only by memory but also by real time processing.A novel technique for keyword search over streamed XML fragments is presented,which adopts broadcast model and hole-filler model for XML fragments dissemination,addressing the problem of disordered fragment transmission and considering the quality of searching results due to either keyword mismatch or data absence.Two efficient indexes for candidate elements are developed to further improve the performance:Hierarchical hash table and LCA table.The former indexes structure keywords which act as the structure of result,while the latter indexes the condition keywords which refine the keyword search condition.SLCA computing algorithm,which is triggered by condition keywords,only computes the candidate fragments that involve keywords,thus avoiding redundant operations that will not contribute to the final result.The algorithm produces part of the matched answers continuously without having to wait for the end of the stream.The experiments evaluate the performance of the SLCA algorithm with different types of keywords,different document fragmentation and different keyword frequencies,and compare the SLCA algorithm with other XML keyword matching algorithms.The experiment study shows that the SLCA algorithm performs well on saving processing power and memory space.