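Query performance prediction (predicting query performance, PQP) is now regarded as one of the most important capabilities of a retrieval system. Research and experiments in recent years indicate that PQP techniques have broad prospects for development and extension in text retrieval. This paper surveys PQP in text retrieval, focusing on its main methods and key techniques. It first introduces the commonly used experimental corpora and evaluation frameworks, then describes the factors that affect query performance. Next, following a classification into pre-retrieval and post-retrieval approaches, it outlines the main current PQP methods and briefly describes applications of PQP in several areas. Finally, it discusses some of the challenges that PQP faces.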
Predicting query performance (PQP) has recently been recognized by the information retrieval (IR) community as an important capability for IR systems. In recent years, work by many research groups has confirmed that PQP is a good way to address the robustness problem of IR systems and is useful for giving feedback to users, search engines, and database creators. This paper surveys the basic PQP approaches for text retrieval. It introduces the experimental data and evaluation methods, presents the contributions of different factors to overall retrieval variability across queries, describes the main PQP approaches in two groups, pre-retrieval and post-retrieval, and presents some applications of PQP. Finally, it summarizes several primary challenges and open issues in PQP.