To address the tendency of support vector machines (SVMs) to favor the larger class when classifying imbalanced data sets, this paper proposes a balancing strategy that adjusts the separating hyperplane. Following the idea of Fisher discriminant analysis, the samples of the two classes are projected onto the normal vector of the separating hyperplane, and the mean and variance of each class's projections are computed. A new threshold for the hyperplane is then derived from the criterion that the misclassification probabilities of the two classes be equal. The method compensates for the bias toward the larger class in imbalanced data classification and improves prediction accuracy. Numerical experiments on imbalanced artificial and real data sets demonstrate the feasibility and effectiveness of the method.
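The threshold adjustment can be illustrated with a minimal sketch. The abstract does not state the closed-form threshold, so the formula below is an assumption: if the projections of each class are treated as approximately Gaussian, the two misclassification probabilities are equal when the threshold lies at the same number of standard deviations from each class mean. The function name `balanced_bias`, the label convention (+1 / -1), and the decision rule `sign(w @ x + b)` are hypothetical choices made for this illustration, not the authors' implementation.

```python
import numpy as np


def balanced_bias(w, X, y):
    """Return a bias b for the rule sign(w @ x + b).

    Assumptions (not taken from the paper): labels are +1 / -1, each class
    has at least two samples, w points toward the +1 class, and the
    projections of each class onto w are roughly Gaussian.
    """
    w = np.asarray(w, dtype=float)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)

    # Project every sample onto the normal vector of the hyperplane.
    proj = X @ w
    pos, neg = proj[y == 1], proj[y == -1]

    # Per-class mean and sample standard deviation of the projections.
    m_pos, m_neg = pos.mean(), neg.mean()
    s_pos, s_neg = pos.std(ddof=1), neg.std(ddof=1)

    # Equal error probabilities under the Gaussian assumption:
    #   (t - m_neg) / s_neg == (m_pos - t) / s_pos
    t = (m_neg * s_pos + m_pos * s_neg) / (s_pos + s_neg)

    # A sample is assigned to the +1 class when its projection exceeds t.
    return -t


# Tiny hand-checkable example: one feature, w = [1].
X_demo = [[0.0], [2.0], [8.0], [12.0]]
y_demo = [-1, -1, 1, 1]
b_demo = balanced_bias([1.0], X_demo, y_demo)
print(b_demo)
```

In this toy example the class means are 1 and 10 and the positive class has twice the standard deviation of the negative class, so the equal-error threshold works out by hand to 4 (a bias of about -4), closer to the tighter class than the midpoint 5.5. With a trained linear SVM, `w` would be the learned normal vector and the returned value would replace the original bias term.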