The L2-kernel classifier approximates the true difference of densities (DoD) under the integrated squared error (ISE) criterion, but it does not explicitly consider the classification margin, which to some extent limits the accuracy of the classifier. In addition, its weight vector is obtained by solving a quadratic programming (QP) problem, which makes training comparatively slow and impractical, especially for large datasets. To address these two drawbacks, this paper proposes a new classification method that constructs the classification margin from the DoD between samples and maximizes this margin. Because the resulting problem reduces to a logistic optimization problem, the method is called the maximum margin logistic vector machine (MMLVM); the optimal weight vector is found by gradient descent. Theoretical analysis is provided on three aspects: the global optimality of the weights, the generalization error bound, and the computational complexity of the algorithm. Experimental results on artificial, UCI, PIE, and USPS datasets are consistent with the theoretical analysis and show that MMLVM overcomes both drawbacks while achieving good performance.
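As a rough illustration of the general idea of learning kernel weights by gradient descent on a logistic objective instead of solving a QP, the following Python sketch fits a weighted kernel expansion f(x) = sum_i alpha_i y_i k(x, x_i). It is not the MMLVM objective itself, whose exact form is not stated in this abstract. The mean logistic loss, the Gaussian kernel, the unconstrained weights, and all names and parameters (`gaussian_kernel`, `fit_logistic_kernel_weights`, `predict`, `sigma`, `lr`, `n_iter`) are assumptions made purely for illustration.

```python
import numpy as np


def gaussian_kernel(A, B, sigma=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = (np.sum(A ** 2, axis=1)[:, None]
          + np.sum(B ** 2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))


def fit_logistic_kernel_weights(X, y, sigma=1.0, lr=0.1, n_iter=200):
    """Gradient descent on the mean logistic loss of
    f(x) = sum_i alpha_i * y_i * k(x, x_i), with labels y in {-1, +1}.

    Assumed objective for illustration only; not the paper's formulation.
    """
    n = X.shape[0]
    K = gaussian_kernel(X, X, sigma)
    G = K * y[None, :]                 # G[j, i] = y_i * k(x_j, x_i)
    alpha = np.full(n, 1.0 / n)        # start from a plain density-difference estimate
    for _ in range(n_iter):
        margins = y * (G @ alpha)      # y_j * f(x_j)
        # sigmoid(-margin), clipped to avoid overflow in exp
        s = 1.0 / (1.0 + np.exp(np.clip(margins, -50.0, 50.0)))
        # gradient of mean_j log(1 + exp(-margin_j)) with respect to alpha
        grad = -(G * y[:, None]).T @ s / n
        alpha = alpha - lr * grad
    return alpha


def predict(X_train, y_train, alpha, X_new, sigma=1.0):
    """Sign of the weighted kernel expansion at the new points."""
    K = gaussian_kernel(X_new, X_train, sigma)
    return np.sign(K @ (alpha * y_train))
```

Each iteration costs one pass over the kernel matrix, which is the kind of per-step cost that makes a gradient method attractive compared with a general QP solver on larger samples. A call such as `alpha = fit_logistic_kernel_weights(X, y)` followed by `predict(X, y, alpha, X_new)` would classify new points under these assumptions.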