Because natural language is inherently ambiguous and varied, a handful of keywords rarely captures a user's real information need. Query expansion addresses this by mining the latent information in the original query terms, which improves the retrieval system's ability to interpret the query. This paper introduces sentence weight into the scoring formula of local context analysis, on the assumption that the more important a sentence returned for the original query is, the more closely the words in it are related to the query terms. It further assumes that a word's positional relationship and dependency relationship to the query terms within the sentence are also important criteria for selecting expansion terms. Accordingly, two query expansion methods are proposed: Sentence Weight & Position-based Context Analysis (SWPCA) and Sentence Weight & Dependency-based Context Analysis (SWDCA). Both methods are applied to the definitional question answering task of TREC. The experimental data show that both methods perform well, with SWDCA performing slightly better than SWPCA.
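The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea behind a sentence-weighted, position-based scheme, not the paper's actual SWPCA computation. It assumes each retrieved sentence already carries a weight, and it assumes a candidate word's contribution is that weight divided by its token distance to the nearest query term. The function name `score_expansion_terms`, the inverse-distance decay, and the `top_k` parameter are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def score_expansion_terms(query_terms, weighted_sentences, top_k=5):
    """Rank candidate expansion terms (illustrative sketch only).

    weighted_sentences: list of (tokens, weight) pairs, where `weight` is an
    assumed, precomputed importance score for the retrieved sentence.
    The inverse-distance decay below is an assumption, not the paper's formula.
    """
    query = {t.lower() for t in query_terms}
    scores = defaultdict(float)

    for tokens, weight in weighted_sentences:
        tokens = [t.lower() for t in tokens]
        query_positions = [i for i, t in enumerate(tokens) if t in query]
        if not query_positions:
            # Sentences without any query term contribute nothing.
            continue
        for i, tok in enumerate(tokens):
            if tok in query:
                continue
            # Distance is at least 1 because `tok` is not a query term.
            distance = min(abs(i - p) for p in query_positions)
            # Heavier sentences and closer words yield larger contributions.
            scores[tok] += weight / distance

    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


if __name__ == "__main__":
    sentences = [
        (["aspirin", "is", "a", "drug"], 1.0),
        (["aspirin", "tablet"], 0.5),
        (["unrelated", "sentence"], 2.0),
    ]
    for term, score in score_expansion_terms(["aspirin"], sentences):
        print(term, round(score, 3))
```

A dependency-based variant in the spirit of SWDCA could, under the same assumptions, replace the token distance with the path length between the candidate word and the query term in a dependency parse; that would need a parser, which this sketch deliberately leaves out.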