In support vector classification, because the samples are unevenly distributed, a Gaussian kernel with a single width over-fits in the dense regions of the space and under-fits in the sparse regions; that is, it incurs local risk. To address this, a secondary kernel with global character is constructed to reduce the local risk produced by the Gaussian kernel. The resulting hybrid kernel is called the primary-secondary kernel. A positive-definiteness condition for the primary-secondary kernel is constructively derived and proved by means of power series, and a two-stage model selection algorithm based on genetic algorithms is further proposed to optimize the parameters of the primary-secondary kernel. Experiments verify the superiority of the primary-secondary kernel and of the model selection algorithm.
In classification by support vector machines with the Gaussian kernel, the kernel width defines the generalization scale in the pattern space or in the feature space. However, a Gaussian kernel with constant width is not well adapted everywhere in the pattern space, since the patterns are not evenly distributed: over-fitting appears in the dense areas and under-fitting in the sparse areas. To reduce such local risks, a secondary kernel with global character is introduced to complement the Gaussian kernel, which is regarded as the primary kernel. The constructed hybrid kernel is called the primary-secondary kernel (PSK). The positive definiteness of PSK under the given constraints is proved by virtue of power series. For support vector machines with PSK, a two-stage model selection based on genetic algorithms is proposed to tune the model parameters: the first stage tunes the parameters of the SVM with the Gaussian kernel alone; the second stage keeps those parameters fixed and further tunes the parameters of the secondary kernel. The two-stage procedure is designed to overcome the optimization bias inherent in the optimization algorithms, which, for support vector machines with multiple parameters, often causes model selection to fail. Finally, experiments demonstrate that PSK outperforms the Gaussian kernel and validate the efficiency of the proposed model selection algorithms.
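The abstract does not specify the functional form of the secondary kernel. As a hedged illustration only, a global component such as a polynomial kernel could be mixed with the Gaussian primary kernel through a convex weight; positive definiteness of such a mixture is immediate, since any convex combination of positive definite kernels is positive definite:

$$K_{\mathrm{PSK}}(x,y) \;=\; \rho\,\exp\!\bigl(-\gamma\lVert x-y\rVert^{2}\bigr) \;+\; (1-\rho)\bigl(\langle x,y\rangle + c\bigr)^{d}, \qquad \rho\in[0,1],\ \gamma>0,\ c\ge 0,\ d\in\mathbb{N}.$$

Here $\rho$ weights the local Gaussian term against the global polynomial term, so a larger $1-\rho$ lends more influence to the secondary kernel in sparse regions.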
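The following is a minimal sketch of the two-stage model selection under the illustrative PSK form above. The tiny genetic algorithm, the parameter ranges, and the synthetic dataset are hypothetical stand-ins for the paper's setup, not its actual implementation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=5, random_state=0)

def psk_gram(A, B, gamma, rho, degree):
    """Assumed PSK: convex mix of a Gaussian (primary) kernel and a
    polynomial (secondary, global) kernel. Any convex combination of
    positive definite kernels is positive definite."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return rho * np.exp(-gamma * sq) + (1 - rho) * (A @ B.T + 1.0) ** degree

def ga_maximize(fitness, bounds, pop=12, gens=8, seed=0):
    """Tiny real-coded GA: tournament selection, arithmetic crossover,
    Gaussian mutation; the best individual seen is kept."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T
    P = rng.uniform(lo, hi, size=(pop, len(bounds)))
    best, best_f = None, -np.inf
    for _ in range(gens):
        f = np.array([fitness(ind) for ind in P])
        if f.max() > best_f:
            best, best_f = P[f.argmax()].copy(), f.max()
        i, j = rng.integers(0, pop, (2, pop))          # binary tournaments
        parents = np.where((f[i] > f[j])[:, None], P[i], P[j])
        a = rng.uniform(size=(pop, 1))                 # arithmetic crossover
        kids = a * parents + (1 - a) * parents[rng.permutation(pop)]
        kids += rng.normal(0.0, 0.1 * (hi - lo), kids.shape)
        P = np.clip(kids, lo, hi)
    return best, best_f

# Stage 1: tune the Gaussian-kernel SVM (genes: log10 C, log10 gamma).
def stage1_fit(ind):
    C, gamma = 10.0 ** ind
    return cross_val_score(SVC(C=C, gamma=gamma), X, y, cv=3).mean()

s1, _ = ga_maximize(stage1_fit, [(-2, 3), (-3, 1)])
C_star, gamma_star = 10.0 ** s1

# Stage 2: freeze C and gamma, tune only the secondary-kernel
# parameters (mixing weight rho, polynomial degree).
def stage2_fit(ind):
    rho, degree = ind[0], int(round(ind[1]))
    k = lambda A, B: psk_gram(A, B, gamma_star, rho, degree)
    return cross_val_score(SVC(C=C_star, kernel=k), X, y, cv=3).mean()

s2, acc = ga_maximize(stage2_fit, [(0.0, 1.0), (1.0, 4.0)])
print(f"C={C_star:.3g}, gamma={gamma_star:.3g}, rho={s2[0]:.2f}, "
      f"degree={int(round(s2[1]))}, cv accuracy={acc:.3f}")
```

Freezing the stage-1 parameters before tuning the secondary kernel mirrors the abstract's motivation: a joint search over all kernel parameters lets the optimizer's bias toward some parameters dominate, whereas the staged search isolates the secondary kernel's contribution.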