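This paper studies methods for diversifying information retrieval results. It first implements the classic Maximal Marginal Relevance (MMR) algorithm for re-ranking retrieved results, and then designs three sub-query construction methods based on query logs: single-character backward expansion, double-character backward expansion, and bidirectional substring expansion. Finally, it explores strategies for combining each of these three sub-query construction methods with the MMR algorithm. Experiments show that a system built with these methods clearly improves the diversity of information retrieval results.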
This paper studies the diversification of information retrieval results. First, it implements the classic Maximal Marginal Relevance (MMR) algorithm to re-rank the initially retrieved results. Then it designs and implements three sub-topic construction methods based on query logs. Finally, it explores how to combine the sub-topics with the MMR algorithm to provide better-diversified results. Experiments indicate that these methods increase the diversity of the results significantly.
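
To illustrate the re-ranking step mentioned above, the following is a minimal sketch of greedy MMR selection. It is not the paper's implementation: the function name `mmr_rerank`, the trade-off parameter `lam`, the Jaccard similarity over token sets, and the toy documents and relevance scores are all assumptions made for illustration. At each step the sketch picks the candidate that maximizes `lam * relevance - (1 - lam) * max similarity to already selected documents`.

```python
from typing import Callable, Dict, List, Optional, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity between two token sets (an assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mmr_rerank(
    candidates: List[str],
    relevance: Dict[str, float],
    similarity: Callable[[str, str], float],
    lam: float = 0.5,
    k: Optional[int] = None,
) -> List[str]:
    """Greedy MMR: repeatedly select the document that best balances
    relevance to the query against redundancy with documents already chosen."""
    remaining = list(candidates)
    selected: List[str] = []
    limit = len(remaining) if k is None else min(k, len(remaining))

    while len(selected) < limit:
        def score(doc: str) -> float:
            # Redundancy is the highest similarity to any selected document;
            # it is 0 while nothing has been selected yet.
            redundancy = max((similarity(doc, s) for s in selected), default=0.0)
            return lam * relevance[doc] - (1.0 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)

    return selected


# Hypothetical toy data: two near-duplicate documents and one on a different aspect.
tokens = {
    "d1": {"apple", "fruit", "red"},
    "d2": {"apple", "fruit", "green"},
    "d3": {"apple", "company", "iphone"},
}
relevance = {"d1": 0.9, "d2": 0.85, "d3": 0.6}


def sim(x: str, y: str) -> float:
    return jaccard(tokens[x], tokens[y])


diversified = mmr_rerank(["d1", "d2", "d3"], relevance, sim, lam=0.5)
relevance_only = mmr_rerank(["d1", "d2", "d3"], relevance, sim, lam=1.0)
print(diversified)     # d3 is promoted ahead of the near-duplicate d2
print(relevance_only)  # plain relevance order
```

With `lam = 1.0` the procedure reduces to ordinary relevance ranking, while smaller values penalize documents that resemble those already selected. How the paper's sub-topics enter the score is not specified in the abstract, so that part is deliberately left out of the sketch.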