Question classification is a key component of automatic question answering systems, and the key words in a question are an important basis for classifying it. This paper investigates the roles of the question word and the head word in question classification and proposes a hierarchical question classifier built on both. The classifier first uses the question word to divide the questions into three groups and then applies a separate classifier designed for each group. For what-type questions, we construct a head-word classifier based on association rules. The hierarchical classifier achieves accuracies of 90.6% on the TREC2007 QA question set and 84.0% on the UIUC dataset, which demonstrates the importance of question words and head words in question classification.
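The two-stage idea can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not say which question words fall into which of the three groups, so the grouping below (`DIRECT`, `WHAT_LIKE`, and everything else) is an assumption, as are the function names, the support and confidence thresholds, and the illustrative labels `HUM`, `LOC`, `NUM`, and `OTHER`. The head word is assumed to be supplied by the caller, since extracting it normally requires a parser.

```python
from collections import Counter

# Assumed grouping of question words; the source only states that
# question words split the questions into three groups.
DIRECT = {"who": "HUM", "where": "LOC", "when": "NUM"}  # word alone decides the class
WHAT_LIKE = {"what", "which"}                           # class decided by the head word
OTHER_WORDS = {"how", "why"}                            # left to a fallback here


def question_word(question):
    """Return the first known question word in the question, or None."""
    for tok in question.lower().rstrip("?").split():
        if tok in DIRECT or tok in WHAT_LIKE or tok in OTHER_WORDS:
            return tok
    return None


def mine_rules(examples, min_support=2, min_confidence=0.6):
    """Mine association rules of the form head_word -> label.

    examples: list of (head_word, label) pairs from labelled what-type questions.
    A rule is kept if the pair occurs at least min_support times and
    count(head, label) / count(head) >= min_confidence.
    """
    head_counts = Counter(head for head, _ in examples)
    pair_counts = Counter(examples)
    best = {}  # head_word -> (confidence, label)
    for (head, label), n in pair_counts.items():
        confidence = n / head_counts[head]
        if n >= min_support and confidence >= min_confidence:
            if head not in best or confidence > best[head][0]:
                best[head] = (confidence, label)
    return {head: label for head, (_, label) in best.items()}


def classify(question, head_word, rules, default="OTHER"):
    """Stage 1: dispatch on the question word. Stage 2: per-group decision."""
    qw = question_word(question)
    if qw in DIRECT:
        return DIRECT[qw]
    if qw in WHAT_LIKE:
        return rules.get(head_word, default)
    return default
```

For example, with training pairs in which "city" mostly co-occurs with `LOC`, `classify("What city hosted the 2008 Olympics?", "city", rules)` follows the what-type branch and looks up the mined rule, while a who-question is resolved by the question word alone. A real system would replace the fallback with a trained classifier for the remaining group.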