To address the combination of question features in machine-learning-based question classification, this paper proposes a feature combination method based on importance-inhibition analysis (IIA). When combining question features, the method considers not only the importance of each individual feature but also the inhibition among the features to be combined. Experimental results on a Chinese question set show that the IIA method improves both the average and the maximum accuracy across all feature combinations and is, on the whole, more efficient than the feature combination method based on importance analysis (IA) alone. Moreover, the IIA method attains the same maximum accuracy as the exhaustive feature combination method, further improving the performance of current Chinese question classification.
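
The abstract does not spell out how importance and inhibition are computed, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes that a feature's importance is the accuracy obtained when that feature is used alone, and that a feature is "inhibited" when adding it to the current combination fails to raise accuracy. The function names, the feature names (`word`, `pos`, `ner`), and the toy scoring function are all hypothetical.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple

# An evaluator maps a feature subset to a classification accuracy.
Evaluate = Callable[[FrozenSet[str]], float]


def combine_features(
    features: Sequence[str], evaluate: Evaluate
) -> Tuple[List[str], float]:
    """Greedy combination guided by importance and an inhibition check."""
    # Assumed importance measure: accuracy of each feature used alone.
    importance = {f: evaluate(frozenset([f])) for f in features}
    ranked = sorted(features, key=lambda f: importance[f], reverse=True)

    selected: List[str] = []
    best = 0.0
    for f in ranked:
        score = evaluate(frozenset(selected + [f]))
        # Assumed inhibition check: keep f only if it raises accuracy
        # when combined with the features selected so far.
        if score > best:
            selected.append(f)
            best = score
    return selected, best


def toy_evaluate(subset: FrozenSet[str]) -> float:
    """Hypothetical accuracies in which 'word' and 'pos' inhibit each other."""
    alone = {"word": 0.60, "pos": 0.50, "ner": 0.40}
    score = max((alone[f] for f in subset), default=0.0)
    score += 0.05 * max(len(subset) - 1, 0)  # small gain per extra feature
    if {"word", "pos"} <= subset:
        score -= 0.10  # penalty for the inhibiting pair
    return score


selected, best = combine_features(["pos", "ner", "word"], toy_evaluate)
print(selected)  # ['word', 'ner']
```

In this toy setting, `pos` ranks second by importance alone but is skipped because combining it with `word` lowers accuracy, whereas the less important `ner` is kept. A purely importance-ordered combination would have added `pos` first. The sketch evaluates one combination per feature, while an exhaustive search would evaluate every subset.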