This paper discusses keyword search over XML documents. Based on the structural characteristics of XML documents and the requirements of keyword queries, we propose the concept of a query theme within a document. We build a theme index over the XML document and design an efficient theme-based query algorithm. The algorithm first uses the theme index and the input keywords to determine the theme of the user's query, and then retrieves the final results according to that theme. During query processing, it excludes keyword nodes in the keyword inverted lists that are irrelevant to the query theme and avoids generating irrelevant results, which improves both query efficiency and result quality. Experimental results show that the algorithm is efficient in the vast majority of cases.