Keyword search is an effective way for most ordinary users to find information, because they do not need to learn a complex query language or understand the structure of the underlying data. This paper studies keyword search over XML documents. It first shows that the result set definition used in earlier work, which is based on SLCA, is incomplete, and then proposes a result set definition based on XLCA that includes all possible results. Based on this definition, it presents a compact index structure and a corresponding search algorithm. Both methods have been implemented, and experiments show that the proposed method yields substantial improvements in performance and scalability over the previously proposed one.
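To make the incompleteness claim concrete, the following minimal sketch computes SLCA results by brute force, assuming the definition commonly used in the literature (smallest lowest common ancestor: an LCA of one match per keyword that has no descendant LCA). The Dewey-label tuples, the function names, and the toy document are illustrative assumptions and are not taken from the paper. The XLCA definition is not given in this abstract, so it is not sketched here.

```python
from itertools import product

# Assumption: each XML node is identified by a Dewey label, written as a
# tuple of child positions from the root, e.g. (0, 2, 1).

def lca(labels):
    """Lowest common ancestor = longest common prefix of the Dewey labels."""
    prefix = []
    for parts in zip(*labels):
        if len(set(parts)) != 1:
            break
        prefix.append(parts[0])
    return tuple(prefix)

def is_proper_ancestor(a, b):
    """True if node a is a strict ancestor of node b."""
    return len(a) < len(b) and b[:len(a)] == a

def all_lcas(keyword_lists):
    """LCA of every combination that picks one matching node per keyword."""
    return {lca(combo) for combo in product(*keyword_lists)}

def slca(keyword_lists):
    """Keep only the LCAs that have no descendant LCA."""
    lcas = all_lcas(keyword_lists)
    return {a for a in lcas
            if not any(is_proper_ancestor(a, b) for b in lcas)}

# Hypothetical document: the root (0,) has children (0,0) matching "xml"
# and (0,1) matching "search", plus a child (0,2) whose own children
# (0,2,0) and (0,2,1) match "xml" and "search" respectively.
xml_matches = [(0, 0), (0, 2, 0)]
search_matches = [(0, 1), (0, 2, 1)]

candidates = all_lcas([xml_matches, search_matches])
smallest = slca([xml_matches, search_matches])
```

In this toy example the root (0,) covers both keywords through its own children (0,0) and (0,1), yet SLCA drops it because its descendant (0,2) is also an LCA. This is the kind of result an SLCA-based result set leaves out. The brute-force enumeration is exponential in the number of keywords and is meant only to illustrate the definition, not the indexed algorithm the paper proposes.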