Feature selection is an important factor affecting question classification in question answering systems. Exploiting the semantic expressiveness of Chinese FrameNet, this paper proposes a feature selection method for question classification based on strong class information words (SCIW). Five kinds of Chinese FrameNet features are first chosen as candidate features. The SCIW method then ranks the classification ability of each individual feature according to its per-category classification precision, and feature combination experiments identify the combination with the best classification performance, thereby achieving feature reduction.
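The two-stage procedure summarized above (rank single features by per-category precision, then search feature combinations) can be illustrated with a minimal sketch. It is not the SCIW implementation: the abstract does not say how per-category precisions are aggregated into a ranking or how combinations are searched, so the sketch assumes a mean over categories and an exhaustive search that prefers smaller subsets on ties. The function names and the `evaluate` callback are hypothetical.

```python
from itertools import combinations
from statistics import mean


def per_class_precision(y_true, y_pred):
    """Return {label: precision} for every label that occurs in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    result = {}
    for label in labels:
        # Gold labels of the questions that were predicted as `label`.
        predicted = [t for t, p in zip(y_true, y_pred) if p == label]
        # Precision = correct predictions of this label / all predictions of it.
        result[label] = (
            sum(t == label for t in predicted) / len(predicted) if predicted else 0.0
        )
    return result


def rank_features(precision_by_feature):
    """Sort feature names by mean per-class precision, best first.

    Averaging over classes is an assumed ranking criterion, not one
    stated in the abstract.
    """
    return sorted(
        precision_by_feature,
        key=lambda name: mean(precision_by_feature[name].values()),
        reverse=True,
    )


def best_combination(features, evaluate):
    """Exhaustively score every non-empty feature subset with `evaluate`.

    Subsets are visited from smallest to largest and only a strictly better
    score replaces the current best, so ties favour the smaller subset.
    """
    best, best_score = None, float("-inf")
    for size in range(1, len(features) + 1):
        for combo in combinations(features, size):
            score = evaluate(combo)
            if score > best_score:
                best, best_score = combo, score
    return best, best_score
```

Here `evaluate` would train and test a classifier on the given feature subset and return its accuracy. With five candidate features there are only 31 non-empty subsets, so an exhaustive search is affordable under these assumptions.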