This paper applies statistical methods to the problem of kernel function selection for support vector machines. The corrected resampled t-test is introduced into kernel selection and used together with k-fold cross-validation and the paired t-test to compare quantitatively the classification ability of nine commonly used kernels. In addition, a new quantitative criterion based on information gain is proposed for evaluating the pattern recognition ability of a kernel, and it is proved to be a nonlinear function of the traditional criteria. Benchmark experiments show that different evaluation criteria yield different results, but that statistical methods can reveal regularities among these differences. The statistical methods themselves also differ significantly, and this difference affects the evaluation results more than the difference among criteria does. Therefore, the classification ability of different kernels can be evaluated objectively only by combining multiple evaluation methods and criteria.
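To make the contrast between the paired t-test and the corrected resampled t-test concrete, the following is a minimal sketch and not the paper's implementation. It assumes the usual form of the correction, in which the variance term 1/n of the paired t-test is replaced by 1/n + n_test/n_train to account for overlapping training sets across resampling runs. The function names, the split sizes, and the accuracy differences are hypothetical and chosen only for illustration.

```python
import math
from statistics import mean, variance


def paired_t_statistic(diffs):
    """Standard paired t statistic over n per-run score differences."""
    n = len(diffs)
    # variance() is the sample variance (n - 1 in the denominator).
    return mean(diffs) / math.sqrt(variance(diffs) / n)


def corrected_resampled_t_statistic(diffs, n_train, n_test):
    """Corrected resampled t statistic (assumed form of the correction).

    The factor 1/n is replaced by 1/n + n_test/n_train, which inflates
    the variance estimate because the resampled runs are not independent.
    """
    n = len(diffs)
    scale = 1.0 / n + n_test / n_train
    return mean(diffs) / math.sqrt(scale * variance(diffs))


# Hypothetical accuracy differences (kernel A minus kernel B) over 10 random splits.
diffs = [0.02, 0.01, 0.03, 0.00, 0.02, 0.01, 0.04, 0.02, 0.01, 0.03]

t_plain = paired_t_statistic(diffs)
t_corr = corrected_resampled_t_statistic(diffs, n_train=900, n_test=100)

print(f"paired t statistic:              {t_plain:.3f}")
print(f"corrected resampled t statistic: {t_corr:.3f}")
```

Both statistics are compared against a t distribution with n - 1 degrees of freedom. Because the corrected statistic is always smaller in magnitude for the same differences, the two tests can disagree about whether one kernel is significantly better than another, which is the kind of method-dependent difference the abstract describes.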