Ranking is an essential component of information retrieval. More than a hundred features have been proposed for constructing ranking functions, and how to use them to build more effective ranking functions has become an active research topic. As a result, learning to rank, a field at the intersection of information retrieval and machine learning, has attracted increasing attention. The way ranking features are constructed makes it clear that they are not fully independent of one another. However, existing learning to rank research rarely builds on feature analysis or approaches the construction of more effective ranking functions from the perspective of feature recombination and selection. To address this problem, this paper proposes the following framework: first, the feature set used to construct ranking functions is analyzed; second, the features are recombined and selected; finally, a ranking function is learned with a learning to rank method. Based on this framework, four feature-processing methods are proposed: feature recombination based on principal component analysis, feature selection based on mean average precision (MAP), forward selection, and the feature selection implicit in learning to rank algorithms. Experimental results show that ranking functions learned after feature processing generally outperform the original ranking functions.
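The framework outlined in the abstract (analyze the features, recombine and select them, then learn a ranking function) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract gives no algorithmic details, so the function names, the pointwise least-squares ranker used as a stand-in for a learning to rank method, and the use of in-sample MAP as the forward-selection criterion are all assumptions made for illustration.

```python
import numpy as np

def average_precision(scores, labels):
    """Average precision for one query, ranking documents by descending score."""
    order = np.argsort(-scores, kind="stable")
    rel = labels[order] > 0
    if not rel.any():
        return 0.0  # convention assumed here: queries with no relevant docs score 0
    hits = np.cumsum(rel)                   # relevant docs seen up to each rank
    ranks = np.arange(1, len(rel) + 1)
    return float((hits[rel] / ranks[rel]).mean())

def mean_average_precision(scores, labels, qids):
    """MAP: mean of per-query average precision."""
    aps = [average_precision(scores[qids == q], labels[qids == q])
           for q in np.unique(qids)]
    return float(np.mean(aps))

def pca_recombine(X, n_components):
    """Feature recombination: project centered features onto the top principal axes."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T

def fit_linear_ranker(X, y):
    """Hypothetical stand-in for a learning to rank method: pointwise least squares."""
    A = np.hstack([X, np.ones((len(X), 1))])   # append a bias column
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def predict(X, w):
    """Score documents with the linear ranker fitted above."""
    return np.hstack([X, np.ones((len(X), 1))]) @ w

def forward_select(X, y, qids, max_features):
    """Greedy forward selection: add the feature that most improves MAP, stop when none does."""
    selected, best_map = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < max_features:
        candidates = []
        for j in remaining:
            cols = selected + [j]
            w = fit_linear_ranker(X[:, cols], y)
            score = mean_average_precision(predict(X[:, cols], w), y, qids)
            candidates.append((score, j))
        score, j = max(candidates, key=lambda t: t[0])
        if score <= best_map:
            break                              # no candidate improves MAP
        selected.append(j)
        remaining.remove(j)
        best_map = score
    return selected, best_map
```

In this sketch MAP is computed on the same data used to fit the ranker, which keeps the example short; a realistic setup would score candidate feature sets on a held-out validation set. Either `pca_recombine` or `forward_select` would be applied to the feature matrix before the final ranking function is trained.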