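Context-Based Query Expansion
  • Journal: Journal of Computer Research and Development (计算机研究与发展)
  • Date: 0
  • Pages: 300-304
  • Language: Chinese
  • Classification: TP391 [Automation and Computer Technology: Computer Application Technology; Automation and Computer Technology: Computer Science and Technology]
  • Author affiliations: [1] Yunnan Provincial Key Laboratory of Computer Applications, Kunming University of Science and Technology, 650051; [2] MOE-Microsoft Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology, 150001
  • Funding: Key Program of the National Natural Science Foundation of China (60736044); National High Technology Research and Development Program of China ("863" Program) (2006AA01Z150); General Program of Applied Basic Research of Yunnan Province (2009ZC032M)
  • Related project: Next-Generation Information Retrieval Research
Chinese abstract: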

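A key problem in information retrieval is that the words used in a query may not match the words used in the document collection, which degrades retrieval effectiveness. To address it, this paper proposes a context-based query expansion method. The method selects expansion terms according to the contextual information of the query, taking into account both the relation of each expansion term to the whole query and its positional relation to the query terms. Experiments on the TREC information retrieval test collection show that the method improves on the usual simple language model by 5% to 19%. Compared with the popular query expansion method based on pseudo feedback, the proposed method also achieves comparable average precision.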

English abstract:

The effectiveness of information retrieval (IR) systems is influenced by the degree of term overlap between user queries and relevant documents. Query-document term mismatch, whether partial or total, is a fact that IR systems must deal with, and query expansion (QE) is one method for doing so. Classical query expansion techniques such as local context analysis use term co-occurrence statistics to incorporate additional contextual terms for enhancing passage retrieval. However, relevant contextual terms do not always co-occur frequently with the query terms, and vice versa. Hence the use of such methods often brings in noise, which reduces precision. Based on an analysis of how queries are produced, the authors propose a new query expansion method that draws on context and global information. Expansion terms are selected according to their relation to the whole query, and the positional relation between terms is also considered. Experimental results on the TREC data collection show that the proposed method outperforms the language model without expansion by 5% to 19%. Compared with pseudo feedback, the popular approach to query expansion, the method achieves competitive average precision.
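
The abstract gives no formulas, so the following Python sketch only illustrates the general idea of choosing expansion terms by their relation to the whole query and by their position relative to the query terms. Everything in it is an assumption made for illustration, not the authors' method: the function names `score_expansion_terms` and `expand_query`, the `window` and `k` parameters, the 1/distance proximity weight, and the sum-of-logarithms combination across query terms.

```python
import math
from collections import defaultdict


def score_expansion_terms(query_terms, documents, window=5):
    """Score candidate terms by proximity-weighted co-occurrence with the query.

    Hypothetical scoring, not taken from the paper: every co-occurrence of a
    candidate with a query term inside `window` tokens adds 1/distance, and
    the per-query-term totals are combined as a sum of log(1 + total), which
    favours candidates that relate to several query terms at once.
    """
    query = set(query_terms)
    # proximity[candidate][query_term] -> accumulated 1/distance weight
    proximity = defaultdict(lambda: defaultdict(float))
    for doc in documents:
        tokens = doc.lower().split()
        for i, tok in enumerate(tokens):
            if tok not in query:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                cand = tokens[j]
                if j == i or cand in query:
                    continue  # skip the query term itself and other query terms
                proximity[cand][tok] += 1.0 / abs(i - j)
    return {
        cand: sum(math.log1p(per_term.get(q, 0.0)) for q in query)
        for cand, per_term in proximity.items()
    }


def expand_query(query_terms, documents, k=2, window=5):
    """Append the k best-scoring candidates to the original query terms."""
    scores = score_expansion_terms(query_terms, documents, window)
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return list(query_terms) + ranked[:k]


if __name__ == "__main__":
    docs = [
        "jaguar car engine speed",
        "jaguar cat jungle",
        "fast car engine repair",
    ]
    print(expand_query(["jaguar", "car"], docs, k=1))
```

On this toy collection, `engine` is the top candidate because it appears close to both `jaguar` and `car`, whereas `cat` is adjacent to only one query term. A real system would replace the whitespace tokenizer and the ad hoc weights with the collection statistics and language-model estimates used in the experiments.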
