Support vector machines (SVMs) are widely used because of their strong learning performance and generalization ability. Making an SVM learn incrementally in an effective way, however, remains an open problem in practical applications. This paper studies the distribution characteristics of support vectors and proposes a new sample-elimination mechanism for incremental SVM training, the distance ratio algorithm. Following a forgetting rule, the algorithm sets a suitable parameter and computes, for each sample, the ratio of its center distance to its distance from the optimal classification surface, as defined by the distance ratio method. Samples that have little influence on subsequent training are then discarded, which effectively prunes the training data. Experiments on standard data sets show that incremental training with this method maintains classification accuracy while effectively improving training speed.
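The elimination step can be illustrated with a minimal sketch. The abstract does not give the exact definitions, so everything below is an assumption made for illustration: the "center distance" is taken to be the Euclidean distance from a sample to the mean of its own class, the classification surface is a linear hyperplane w·x + b = 0, and a sample is discarded when its ratio falls below a hypothetical `threshold` parameter (a small ratio meaning the sample lies near its class center and far from the surface, so it is unlikely to become a support vector). The function names are likewise hypothetical and are not taken from the paper.

```python
import math


def class_center(samples):
    """Mean vector of a list of equal-length feature vectors (assumed class center)."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]


def distance_ratio(x, center, w, b):
    """Assumed ratio: (distance to class center) / (distance to hyperplane w.x + b = 0)."""
    center_dist = math.dist(x, center)
    plane_dist = abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / math.hypot(*w)
    # A sample lying exactly on the hyperplane is always kept.
    return math.inf if plane_dist == 0 else center_dist / plane_dist


def prune(samples, w, b, threshold):
    """Keep samples of one class whose ratio is at least `threshold` (assumed rule)."""
    center = class_center(samples)
    return [x for x in samples if distance_ratio(x, center, w, b) >= threshold]


# Toy positive class and the hyperplane x1 = 0 (illustrative values only).
positives = [[0.5, 0.0], [3.0, 0.0], [3.0, 1.0], [5.5, -1.0]]
kept = prune(positives, w=[1.0, 0.0], b=0.0, threshold=1.0)
```

In this toy run only the sample closest to the hyperplane survives. In an incremental setting, the retained samples would be merged with the newly arriving batch before retraining, and the threshold would control the trade-off between training-set size and the risk of discarding a future support vector.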