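Learning to rank, a field at the intersection of information retrieval and machine learning, is receiving growing attention. However, almost no learning-to-rank algorithm accounts for differences among queries. In this paper, queries are modeled as multivariate Gaussian distributions, the KL divergence is used to measure the distance between queries, spectral clustering is applied to cluster the queries, and a ranking function is trained for each cluster. Experimental results show that the ranking functions obtained with clustering require fewer training examples, yet their performance is comparable to, or even better than, that of the ranking functions obtained without clustering.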
Learning to rank, an interdisciplinary field of information retrieval and machine learning, has drawn increasing attention, and many models have been designed to optimize ranking functions. However, few methods take the differences among queries into account. In this paper, queries are modeled as multivariate Gaussian distributions, and the Kullback-Leibler divergence is adopted as the distance measure between them. Spectral clustering is applied to group the queries into several clusters, and a ranking function is learned for each cluster. The experimental results show that the ranking functions learned with clustering are trained on less data, yet are comparable to, or even outperform, those learned without clustering.
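
The query-clustering pipeline summarized above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several details are assumptions made for illustration: each query is represented by a matrix of document feature vectors, a small ridge term keeps the covariance invertible, the KL divergence is symmetrized by summing both directions (plain KL is asymmetric), distances are turned into affinities with a Gaussian kernel, and the spectral step is reduced to a two-way split on the Fiedler vector of the normalized Laplacian. All function names (`fit_gaussian`, `kl_gaussian`, `symmetric_kl_matrix`, `spectral_bipartition`) are hypothetical.

```python
import numpy as np


def fit_gaussian(features, ridge=1e-3):
    """Fit a multivariate Gaussian to one query's document features (one row per document).

    The ridge term is an assumption added here to keep the covariance invertible.
    """
    mean = features.mean(axis=0)
    cov = np.cov(features, rowvar=False) + ridge * np.eye(features.shape[1])
    return mean, cov


def kl_gaussian(p, q):
    """Closed-form KL(p || q) for two multivariate Gaussians given as (mean, cov)."""
    (m0, s0), (m1, s1) = p, q
    k = m0.shape[0]
    s1_inv = np.linalg.inv(s1)
    diff = m1 - m0
    _, logdet0 = np.linalg.slogdet(s0)
    _, logdet1 = np.linalg.slogdet(s1)
    return 0.5 * (np.trace(s1_inv @ s0) + diff @ s1_inv @ diff - k + logdet1 - logdet0)


def symmetric_kl_matrix(gaussians):
    """Pairwise symmetrized KL, KL(p||q) + KL(q||p), between query Gaussians."""
    n = len(gaussians)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d = kl_gaussian(gaussians[i], gaussians[j]) + kl_gaussian(gaussians[j], gaussians[i])
            dist[i, j] = dist[j, i] = d
    return dist


def spectral_bipartition(dist, sigma=None):
    """Split queries into two clusters using the Fiedler vector of the normalized Laplacian."""
    if sigma is None:
        positive = dist[dist > 0]
        sigma = np.median(positive) if positive.size else 1.0
    affinity = np.exp(-dist / sigma)  # Gaussian-kernel affinity (an assumption)
    inv_sqrt_deg = 1.0 / np.sqrt(affinity.sum(axis=1))
    laplacian = np.eye(len(dist)) - inv_sqrt_deg[:, None] * affinity * inv_sqrt_deg[None, :]
    _, eigvecs = np.linalg.eigh(laplacian)  # eigenvalues in ascending order
    fiedler = eigvecs[:, 1]  # eigenvector of the second-smallest eigenvalue
    return (fiedler > 0).astype(int)
```

Given a list of per-query feature matrices, the calls chain as `spectral_bipartition(symmetric_kl_matrix([fit_gaussian(f) for f in feature_matrices]))`, where `feature_matrices` is a hypothetical input. The resulting labels would then select which queries train each cluster's ranking function; that training step, and clustering into more than two groups, are outside this sketch.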