Multi-class classification is an important and ongoing research topic in data mining and machine learning. Recently, the K-SVCR method, which has a novel structure, was proposed for this problem. Compared with other methods, its greatest advantage is that it can use all the information in the training data during training. However, like the one-against-one method, for a K-class classification problem it must solve K(K-1)/2 quadratic programs in order to assign a pattern to an appropriate class, so developing a fast and effective training algorithm is very important. In this paper, we first propose a new model based on the K-SVCR method; we then transform the new model into a complementarity problem and, via the implicit Lagrangian function, further into a strongly convex unconstrained optimization problem, for which we establish a fast Newton algorithm with global convergence and finite termination. Moreover, using the Sherman-Morrison-Woodbury identity, the $l \times l$ matrix handled by the algorithm (where $l$ is the total number of patterns) is reduced to an $(n+1) \times (n+1)$ matrix (where $n$ is the dimensionality of the patterns). For many multi-class classification problems $n$ is much smaller than $l$, so the algorithm can be implemented efficiently. Preliminary experimental results show that the algorithm performs well in both classification accuracy and training speed.
Multi-class classification is an important and ongoing research subject in data mining and machine learning. Recently, a new structural method named K-SVCR was proposed for tackling multi-class classification problems. Compared with other methods, its remarkable advantage is that all the information in the training patterns is incorporated when a multi-class problem is decomposed. However, as with the popular one-against-one method, for a K-class classification problem, K(K-1)/2 quadratic programs need to be solved to assign a new pattern to a proper class. It is therefore extremely important to develop an algorithm that can solve these quadratic programs efficiently. In this paper, we start with a new formulation proposed on the basis of the K-SVCR method. We then transform it into a complementarity problem and, by using the implicit Lagrangian function, further into a strongly convex unconstrained optimization problem. A fast Newton algorithm with global convergence and finite termination properties is established for solving the resulting optimization problem. In addition, for many multi-class classification problems, the total number of patterns l is much larger than the dimensionality n of the patterns. We exploit this property by inverting an (n + 1) × (n + 1) matrix instead of the potentially much larger l × l matrix in the algorithm, via the Sherman-Morrison-Woodbury identity. This indicates that the algorithm can be implemented efficiently in practice. Preliminary numerical experiments on benchmark datasets show that the algorithm has good performance in both accuracy and training speed.
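The dimension-reduction step described above can be sketched numerically. The following is a minimal illustration (not the paper's exact algorithm) of the Sherman-Morrison-Woodbury identity for a matrix of the form I + H H^T, where H is an l × (n+1) data matrix: the l × l inverse is obtained by inverting only an (n+1) × (n+1) matrix, since (I + H H^T)^{-1} = I - H (I + H^T H)^{-1} H^T. The sizes and random data here are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
l, n = 500, 10                       # many patterns (l), low dimensionality (n)
H = rng.standard_normal((l, n + 1))  # l x (n+1) matrix, as in the abstract

# Direct approach: invert the full l x l matrix, O(l^3) work.
direct = np.linalg.inv(np.eye(l) + H @ H.T)

# Sherman-Morrison-Woodbury: only an (n+1) x (n+1) matrix is inverted.
small = np.linalg.inv(np.eye(n + 1) + H.T @ H)
smw = np.eye(l) - H @ small @ H.T

# The two inverses agree to numerical precision.
print(np.allclose(direct, smw))
```

When n is much smaller than l, the dominant cost drops from O(l^3) to O(l n^2), which is what makes the Newton iteration practical for large pattern sets.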