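Designing an effective relevance ranking function is a core problem in information retrieval research, because the ranking function directly affects the quality of search results. The quality of a ranking function is generally assessed with information retrieval evaluation measures. The main difficulty in optimizing for these measures is that they all depend on the ranked positions of the result documents, so studying the positions of relevant documents in the result list returned for a query is very important. A new input sample is constructed by exploring the partial-order relations between relevant and irrelevant documents; the sample consists of one relevant document and a group of irrelevant documents, and it distinguishes document relevance more effectively. Based on this input sample, a position loss function is defined to optimize the ranking results. Experimental results on the public dataset LETOR 3.0 show that the method improves the accuracy of several ranking evaluation measures by 2% on average, demonstrating the effectiveness of the proposed method.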
Designing effective ranking functions is a core problem in information retrieval, since the ranking function directly affects the relevance of the search results. Learning ranking functions from preference data in particular has recently attracted much interest. Ranking algorithms are usually evaluated with information retrieval measures, and the main difficulty in optimizing these measures directly is that they depend on the ranks of the documents. It is therefore important to optimize the positions of relevant documents in the result list. Specifically, the role of the preference relations between relevant and irrelevant documents in the learning process is investigated. A new input sample, named the one-group sample, is constructed for a given query from one relevant document and a group of irrelevant documents; this sample distinguishes the relevance of documents more effectively. Based on the new samples, a position-based loss function is developed to improve the performance of the learned ranking functions. Experiments on the LETOR 3.0 data set show that the method improves the accuracy of several ranking evaluation measures by 2% on average, demonstrating its effectiveness.
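
The one-group sample and the idea of a position-based loss can be illustrated with a minimal sketch. The abstract does not give the exact loss, so everything below is an assumption made for illustration: the linear scoring function, the helper names (`build_one_group_samples`, `position_in_group`, `hinge_position_loss`), and the hinge surrogate, which is a standard stand-in and not necessarily the loss proposed in the paper.

```python
from typing import List, Sequence, Tuple

# A one-group sample: (relevant document, group of irrelevant documents).
OneGroupSample = Tuple[Sequence[float], List[Sequence[float]]]


def build_one_group_samples(
    features: List[Sequence[float]], labels: List[int]
) -> List[OneGroupSample]:
    """Pair each relevant document of a query with all of its irrelevant documents."""
    relevant = [x for x, y in zip(features, labels) if y > 0]
    irrelevant = [x for x, y in zip(features, labels) if y == 0]
    return [(r, irrelevant) for r in relevant]


def score(w: Sequence[float], x: Sequence[float]) -> float:
    """Hypothetical linear ranking function f(x) = w . x."""
    return sum(wi * xi for wi, xi in zip(w, x))


def position_in_group(w: Sequence[float], sample: OneGroupSample) -> int:
    """1-based position of the relevant document within its group:
    1 plus the number of irrelevant documents scoring at least as high."""
    rel, group = sample
    s = score(w, rel)
    return 1 + sum(1 for d in group if score(w, d) >= s)


def hinge_position_loss(
    w: Sequence[float], sample: OneGroupSample, margin: float = 1.0
) -> float:
    """Assumed surrogate: sum of hinge penalties over the group.
    With margin 1.0 it upper-bounds position_in_group(...) - 1."""
    rel, group = sample
    s = score(w, rel)
    return sum(max(0.0, margin - (s - score(w, d))) for d in group)


# Toy query: one relevant document and two irrelevant ones.
features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
labels = [1, 0, 0]
samples = build_one_group_samples(features, labels)

good_w = [1.0, 0.0]  # ranks the relevant document first
bad_w = [0.0, 1.0]   # ranks the relevant document last
```

For the toy query, `good_w` places the relevant document at position 1 and `bad_w` places it at position 3; the surrogate loss grows accordingly, which is the behavior a position-based loss is meant to capture.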