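Model selection is a fundamental problem for support vector machines (SVM). Based on approximate computation of the kernel matrix and on the regularization path, a new model selection method for SVM is proposed. First, a preliminary theory of approximate model selection is developed: a kernel matrix approximation algorithm, KMA-α, is given, an approximation error bound theorem for KMA-α is proved, and from it a model approximation error bound for SVM is obtained. Then, an approximate model selection algorithm, AMSRP, is proposed. The algorithm uses the low-rank approximation of the kernel matrix computed by KMA-α to make solving the SVM more efficient, and uses the regularization path algorithm to make tuning the penalty factor C more efficient. Finally, comparative experiments on benchmark datasets verify the feasibility and computational efficiency of AMSRP. The experimental results show that AMSRP significantly improves the efficiency of SVM model selection while maintaining test set accuracy. Theoretical analysis and experimental results indicate that AMSRP is a sound and efficient model selection algorithm.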
Model selection is an indispensable step to guarantee the generalization of support vector machines (SVM). The main problem of existing SVM model selection approaches is that each iteration requires solving a standard SVM, which is computationally expensive. In this paper, a novel model selection approach for SVM via kernel matrix approximation and regularization path is proposed, based on the observation that approximate computation is sufficient for model selection. First, a kernel matrix approximation algorithm, KMA-α, is presented and its matrix approximation error bound is analyzed. Then, an upper bound on the model approximation error is derived from the error bound of KMA-α. Under the guarantee of these approximation error bounds, an approximate model selection algorithm, AMSRP, is proposed. AMSRP applies KMA-α to compute a low-rank approximation of the kernel matrix, which is used to solve the quadratic program of SVM efficiently, and further uses the regularization path algorithm to tune the penalty factor C efficiently. Finally, the feasibility and efficiency of AMSRP are verified on benchmark datasets. Experimental results show that AMSRP significantly improves the efficiency of model selection for SVM while preserving test set accuracy. Theoretical and experimental results demonstrate that AMSRP is a feasible and efficient model selection algorithm.
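
As a rough illustration of the two ingredients named above (a low-rank kernel matrix approximation, and cheap tuning of the penalty factor C on top of it), the following Python sketch may help. It is not KMA-α or AMSRP, whose details are not given in this abstract. It substitutes Nyström sampling for the kernel approximation and a regularized least-squares classifier for the SVM quadratic program, and it sweeps C by reusing a single eigendecomposition rather than following a true regularization path. The function names `rbf_kernel`, `nystrom_factor` and `sweep_C`, the synthetic data, and all parameter values are assumptions made for illustration only.

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Y."""
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def nystrom_factor(X, landmarks, gamma):
    """Return F (n x r) with F @ F.T ~= K, via Nystrom sampling (a stand-in, not KMA-α)."""
    C = rbf_kernel(X, X[landmarks], gamma)      # n x m block of kernel columns
    W = C[landmarks]                            # m x m block among the landmarks
    vals, vecs = np.linalg.eigh(W)              # W is symmetric positive semidefinite
    keep = vals > 1e-10 * vals.max()            # drop numerically null directions
    return C @ (vecs[:, keep] / np.sqrt(vals[keep]))   # F @ F.T = C pinv(W) C.T

def sweep_C(F, y, C_grid):
    """Training accuracy of a regularized least-squares classifier for each C.

    Stand-in for the SVM QP: minimizes ||F w - y||^2 + (1/C) ||w||^2.
    F.T @ F is eigendecomposed once, so no new factorization is needed per C.
    """
    s, U = np.linalg.eigh(F.T @ F)
    b = U.T @ (F.T @ y)
    accs = []
    for C in C_grid:
        w = U @ (b / (s + 1.0 / C))
        accs.append(float(np.mean(np.sign(F @ w) == y)))
    return accs

# Synthetic two-class data (assumed purely for the demonstration).
rng = np.random.default_rng(0)
n = 200
X = np.vstack([rng.normal(-1.0, 1.0, (n // 2, 2)),
               rng.normal(1.0, 1.0, (n // 2, 2))])
y = np.concatenate([-np.ones(n // 2), np.ones(n // 2)])
gamma = 0.5

K = rbf_kernel(X, X, gamma)                     # exact kernel, for comparison only
order = rng.permutation(n)
for m in (10, 40, n):
    F = nystrom_factor(X, order[:m], gamma)
    rel_err = np.linalg.norm(K - F @ F.T) / np.linalg.norm(K)
    print(f"landmarks={m:3d}  rank={F.shape[1]:3d}  relative error={rel_err:.2e}")

F = nystrom_factor(X, order[:40], gamma)
accs = sweep_C(F, y, [0.01, 1.0, 100.0])
print("training accuracy per C:", accs)
```

The design point this sketch tries to convey is that once a factor `F` with few columns is available, every candidate value of C works with matrices whose size depends on the rank rather than on the number of training points. That is the kind of saving the abstract attributes to combining a low-rank kernel approximation with efficient tuning of C.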