This paper introduces the basic structure of an index system and two classic query processing strategies, DAAT (document-at-a-time) and TAAT (term-at-a-time), and gives implementation details of the query processing algorithms for both strategies under the Boolean operators AND and OR. The analysis shows that for queries over very large indexes, DAAT index traversal is more efficient than TAAT index traversal, and that the performance gap between OR queries and AND queries, with OR being much slower, widens further. Several groups of experiments on the TREC WT2G and GOV2 datasets verify these analytic conclusions. Finally, directions for future research on search engine query processing over very large indexes are identified.
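To make the two traversal strategies concrete, the following is a minimal Python sketch of TAAT and DAAT under AND and OR. It is an illustration only, not the implementation evaluated in the paper: the toy index `INDEX`, the function names `taat` and `daat`, and the `mode` parameter are all assumptions introduced here, and posting lists are assumed to be uncompressed, sorted lists of document IDs with no scoring or skip pointers.

```python
from collections import defaultdict

# Hypothetical toy inverted index: term -> sorted list of document IDs.
INDEX = {
    "query": [1, 3, 5, 8],
    "index": [2, 3, 5, 9],
    "search": [3, 4, 5, 8],
}


def taat(terms, index, mode="AND"):
    """Term-at-a-time: scan one full posting list after another,
    keeping a per-document accumulator of how many terms matched."""
    counts = defaultdict(int)
    for term in terms:
        for doc in index.get(term, []):
            counts[doc] += 1
    # AND needs every term to match; OR needs at least one.
    need = len(terms) if mode == "AND" else 1
    return sorted(d for d, c in counts.items() if c >= need)


def daat(terms, index, mode="AND"):
    """Document-at-a-time: move one cursor per posting list in parallel
    and decide on each candidate document before moving on."""
    lists = [index.get(t, []) for t in terms]
    if not lists:
        return []
    pos = [0] * len(lists)
    result = []
    if mode == "AND":
        # Stop as soon as any list is exhausted.
        while all(p < len(l) for p, l in zip(pos, lists)):
            heads = [l[p] for p, l in zip(pos, lists)]
            target = max(heads)
            if all(h == target for h in heads):
                result.append(target)
                pos = [p + 1 for p in pos]
            else:
                # Advance every lagging cursor up to the largest head.
                for i, l in enumerate(lists):
                    while pos[i] < len(l) and l[pos[i]] < target:
                        pos[i] += 1
    else:
        # OR must visit every posting, so it continues until all lists end.
        while any(p < len(l) for p, l in zip(pos, lists)):
            current = min(l[p] for p, l in zip(pos, lists) if p < len(l))
            result.append(current)
            for i, l in enumerate(lists):
                if pos[i] < len(l) and l[pos[i]] == current:
                    pos[i] += 1
    return result


terms = ["query", "index", "search"]
print(taat(terms, INDEX, "AND"), daat(terms, INDEX, "AND"))
print(taat(terms, INDEX, "OR"), daat(terms, INDEX, "OR"))
```

In this sketch, DAAT with AND can stop as soon as one posting list runs out and needs no accumulator table, while OR has to touch every posting in every list. This is consistent with the direction of the conclusions stated above, although the sketch itself measures nothing.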