This paper addresses two problems in learning to rank: how to select the samples most worth labeling, and how to train a good ranking model from as few labeled samples as possible. It introduces the idea of active learning into learning to rank and proposes ActivePRank, an active ranking algorithm based on the ranking perceptron (PRank). Experimental results on a real-world dataset show that the algorithm reduces the amount of labeling required without degrading the performance of the ranking model and, for the same amount of labeled data, improves ranking accuracy.
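To make the idea concrete, the following is a minimal sketch of how an active learning loop might be wrapped around a PRank-style ranking perceptron. The update rule follows the standard PRank formulation (a weight vector plus ordered thresholds). The query criterion is an assumption, since the abstract does not state the selection strategy: here the sample whose score lies closest to any threshold is treated as the most uncertain. The names `PRank`, `margin`, `select_query`, `active_train`, and `oracle` are hypothetical and chosen only for illustration.

```python
class PRank:
    """Ranking perceptron with ranks 1..n_ranks (n_ranks >= 2)."""

    def __init__(self, n_features, n_ranks):
        self.k = n_ranks
        self.w = [0.0] * n_features
        # Thresholds b_1..b_{k-1}; the last threshold b_k = +inf is implicit.
        self.b = [0.0] * (n_ranks - 1)

    def score(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def predict(self, x):
        # Predicted rank is the smallest r with score - b_r < 0.
        s = self.score(x)
        for r, b in enumerate(self.b, start=1):
            if s - b < 0:
                return r
        return self.k

    def update(self, x, y):
        # Standard PRank step: change the model only on a ranking mistake.
        if self.predict(x) == y:
            return
        s = self.score(x)
        taus = []
        for r, b in enumerate(self.b, start=1):
            y_r = -1 if y <= r else 1
            taus.append(y_r if (s - b) * y_r <= 0 else 0)
        total = sum(taus)
        self.w = [wi + total * xi for wi, xi in zip(self.w, x)]
        self.b = [b - t for b, t in zip(self.b, taus)]

    def margin(self, x):
        # Assumed uncertainty measure: distance to the nearest threshold.
        s = self.score(x)
        return min(abs(s - b) for b in self.b)


def select_query(model, pool):
    # Assumed criterion: index of the unlabeled sample with the smallest margin.
    return min(range(len(pool)), key=lambda i: model.margin(pool[i]))


def active_train(model, pool, oracle, budget):
    # Query at most `budget` labels from the (hypothetical) labeling oracle.
    pool = list(pool)
    for _ in range(min(budget, len(pool))):
        x = pool.pop(select_query(model, pool))
        model.update(x, oracle(x))
    return model
```

In this sketch each queried label triggers a single perceptron update. Other selection criteria, or retraining over all labeled samples after each query, would fit the same loop; the abstract does not say which variant ActivePRank uses.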