When the sample distributions of the two classes differ, least-squares classifiers such as the proximal support vector machine (PSVM) produce a biased decision boundary and therefore cannot achieve minimum-error-rate classification. Based on an analysis of the equivalent generalized eigenvalue decomposition model of PSVM, this paper proposes an optimized-sample-distribution PSVM that improves the decision surface of the original PSVM. The basic idea is to introduce a sample-distribution regularization term that maximizes the distances of correctly classified samples from the decision surface while minimizing the distances of misclassified samples from it, and to construct the corresponding generalized eigenvalue decomposition model. Comparative experiments on artificial datasets and on 10 data subsets from the UCI datasets verify that the improved model effectively adjusts the decision boundary and thereby achieves better classification performance.
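For orientation, the baseline classifier can be sketched as follows. This is a minimal illustration of the standard linear PSVM in its usual regularized least-squares form, assuming the closed-form solution [w; γ] = (I/ν + EᵀE)⁻¹ Eᵀd with E = [A, −e]. It is not the generalized eigenvalue decomposition model or the sample-distribution regularization term proposed in this paper, and the names `psvm_fit`, `psvm_predict`, and `nu` are hypothetical, introduced only for this example.

```python
import numpy as np


def psvm_fit(A, d, nu=1.0):
    """Fit a linear PSVM (assumed standard least-squares formulation).

    A  : (m, n) array of samples, one per row.
    d  : (m,) array of labels in {+1, -1}.
    nu : trade-off weight on the squared error (hypothetical default).
    Returns (w, gamma) defining the decision surface x @ w - gamma = 0.
    """
    A = np.asarray(A, dtype=float)
    d = np.asarray(d, dtype=float)
    m, n = A.shape
    # Augment the data with a -1 column so the offset gamma is solved jointly.
    E = np.hstack([A, -np.ones((m, 1))])
    # Regularized normal equations: (I/nu + E^T E) z = E^T d, with z = [w; gamma].
    lhs = np.eye(n + 1) / nu + E.T @ E
    rhs = E.T @ d
    z = np.linalg.solve(lhs, rhs)
    return z[:n], z[n]


def psvm_predict(A, w, gamma):
    """Assign +1 or -1 according to the side of the decision surface."""
    scores = np.asarray(A, dtype=float) @ w - gamma
    return np.where(scores >= 0, 1, -1)


if __name__ == "__main__":
    # Synthetic two-class data whose spreads differ between the classes.
    rng = np.random.default_rng(0)
    pos = rng.normal(loc=2.0, scale=0.5, size=(100, 2))
    neg = rng.normal(loc=-2.0, scale=2.0, size=(100, 2))
    A = np.vstack([pos, neg])
    d = np.concatenate([np.ones(100), -np.ones(100)])
    w, gamma = psvm_fit(A, d, nu=1.0)
    accuracy = np.mean(psvm_predict(A, w, gamma) == d)
    print("w =", w, "gamma =", gamma, "training accuracy =", accuracy)
```

Because the squared-error fit pulls the surface toward a least-squares compromise between the two classes, classes with unequal spread, as in the synthetic data above, are the setting in which the bias discussed in the abstract would be expected to appear.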