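Query expansion is an effective method for improving retrieval performance. However, many query expansion methods select expansion terms without fully considering the correlations among terms or between terms and documents, so too much irrelevant information may be added during expansion, which degrades retrieval performance. By computing the correlations among documents and among terms, we link documents and terms to build a Markov network retrieval model. Term cliques are then extracted according to the mapping between the term subspace and the document subspace, and the extracted clique information is used for query expansion, so that the expansion content is more relevant. Experiments show that the Markov retrieval model based on document clique dependency effectively improves retrieval effectiveness.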
Query expansion is an effective way to improve the performance of information retrieval. However, many query expansion methods select expansion terms without fully accounting for the correlations among terms or between terms and documents, which may reduce retrieval performance. Because information about the correlation between terms and documents can improve retrieval, this paper computes the correlations among documents and among terms, and maps terms to documents to build a Markov network retrieval model; it then extracts term cliques according to the mapping information. The mapping information is used to divide the term cliques into two categories: document-based and non-document-based. Document-based term cliques are more relevant to the query topic, so they are given greater weight, and the information from both kinds of term cliques is used to assist retrieval. The proposed method therefore makes the expansion content more relevant to the query. Experimental results show that the proposed model improves retrieval performance.
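
The clique-extraction and weighting idea can be illustrated with a minimal Python sketch. It is not the model from this paper, whose formulas the abstract does not give. It assumes, purely for illustration, that term correlation is the cosine similarity of binary document-occurrence vectors, that a term clique is a triple of pairwise-correlated terms containing the query term, and that a clique counts as document-based when all of its terms appear together in at least one document. The function names, the threshold, and the weights 1.0 and 0.5 are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def term_doc_sets(docs):
    """Map each term to the set of ids of the documents that contain it."""
    index = {}
    for doc_id, text in enumerate(docs):
        for term in set(text.lower().split()):
            index.setdefault(term, set()).add(doc_id)
    return index


def correlation(a, b, index):
    """Assumed measure: cosine similarity of binary occurrence vectors."""
    docs_a, docs_b = index[a], index[b]
    return len(docs_a & docs_b) / sqrt(len(docs_a) * len(docs_b))


def term_cliques(query_term, index, threshold=0.5):
    """Return 3-term cliques that contain query_term in the term graph
    whose edges join terms with correlation >= threshold."""
    neighbours = sorted(
        t for t in index
        if t != query_term and correlation(query_term, t, index) >= threshold
    )
    return [
        (query_term, u, v)
        for u, v in combinations(neighbours, 2)
        if correlation(u, v, index) >= threshold
    ]


def expansion_weights(query_term, docs, threshold=0.5, w_doc=1.0, w_other=0.5):
    """Weight candidate expansion terms by the kind of clique they come from.

    Hypothetical rule: a clique is document-based if some document contains
    all of its terms; such cliques get w_doc, the others get w_other.
    A term keeps the largest weight of any clique it belongs to.
    """
    index = term_doc_sets(docs)
    weights = {}
    for clique in term_cliques(query_term, index, threshold):
        shared_docs = set.intersection(*(index[t] for t in clique))
        weight = w_doc if shared_docs else w_other
        for term in clique[1:]:  # skip the query term itself
            weights[term] = max(weights.get(term, 0.0), weight)
    return weights
```

Under these assumptions, calling `expansion_weights("a", ["a b", "b c", "a c"])` gives the lower weight to `b` and `c`, because the three terms are pairwise correlated but never occur together in one document. Terms from a clique that is supported by a shared document receive the higher weight instead.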