XML is widely used in many application domains, and keyword-based search engines have achieved great commercial success. In XML information retrieval, ranking results by relevance so that the most relevant ones appear first directly affects retrieval quality and user satisfaction. The existing ALCA algorithm is efficient but does not sort its results by relevance. This paper adds a relevance ranking method on top of ALCA: all results (LCAs) are first divided into two relevance classes according to whether the root node of the result fragment contains a keyword, and the results in each class are then sorted separately with the proposed ranking function. The relevance of a result fragment to the user's information need is determined by the total contribution of the element, attribute, and text nodes it contains. Experimental results show that the improved algorithm achieves good ranking effectiveness.
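The two-level ranking described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the abstract does not give the contribution formula, so the `WEIGHTS` values, the substring-based keyword matching, the reading of "root contains a keyword" as a match in the root's own tag, attributes, or direct text, and the choice to rank that class first are all assumptions. The function names `score`, `root_contains_keyword`, and `rank` are hypothetical, and the result fragments are assumed to have been produced already (for example by ALCA).

```python
import xml.etree.ElementTree as ET

# Hypothetical per-node-type weights; the abstract does not state the real values.
WEIGHTS = {"element": 3.0, "attribute": 2.0, "text": 1.0}


def _hits(s, keywords):
    """Count how many of the (lowercase) keywords occur as substrings of s."""
    s = (s or "").lower()
    return sum(1 for k in keywords if k in s)


def root_contains_keyword(root, keywords):
    """Assumed test: a keyword matches the root's own tag, attributes, or direct text."""
    own = [root.tag, root.text or ""] + list(root.attrib) + list(root.attrib.values())
    return any(_hits(s, keywords) > 0 for s in own)


def score(root, keywords):
    """Sum the contributions of element, attribute, and text nodes in the fragment."""
    total = 0.0
    for node in root.iter():
        total += WEIGHTS["element"] * _hits(node.tag, keywords)
        for name, value in node.attrib.items():
            total += WEIGHTS["attribute"] * _hits(name + " " + value, keywords)
        text = node.text or ""
        if node is not root:
            # Tail text of a descendant still lies inside the fragment.
            text += " " + (node.tail or "")
        total += WEIGHTS["text"] * _hits(text, keywords)
    return total


def rank(fragments, keywords):
    """Sort by class first (root match assumed more relevant), then by descending score."""
    keywords = [k.lower() for k in keywords]

    def key(root):
        tier = 0 if root_contains_keyword(root, keywords) else 1
        return (tier, -score(root, keywords))

    return sorted(fragments, key=key)
```

Because the class is the first component of the sort key, a fragment whose root does not match a keyword never outranks one whose root does, however high its summed score; within a class, the summed contribution alone decides the order.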