This paper proposes using the kernel K-means clustering algorithm to extract a feature vector set from the sample set and training the SVM on that set, which reduces the size of the SVM. Because the choice of kernel function affects the classification performance of an SVM model, several kernel functions with different nonlinear mapping abilities are combined linearly. A semi-definite programming model of the combined SVM is constructed on the feature training set, and the optimal combination coefficients are obtained with the interior-point method. The resulting semi-definite programming SVM has a stronger nonlinear mapping ability and is applied to spam tag detection. Experiments on UCI datasets, comparing against the double-layer reduction support vector machine method, show that the new spam tag detection method improves the recognition rate and substantially reduces training time.
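The two ideas summarized above, sample reduction by kernel K-means and a linear combination of kernels, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the choice of an RBF and a polynomial kernel, the rule of keeping the sample nearest each cluster mean in feature space, and the hand-fixed weights are all assumptions made for illustration. The paper obtains the combination coefficients from a semi-definite program solved by the interior-point method, which this sketch does not attempt.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two points."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def poly_kernel(x, y, degree=2, c=1.0):
    """Polynomial kernel between two points."""
    return (sum(a * b for a, b in zip(x, y)) + c) ** degree


def combined_kernel(x, y, weights=(0.5, 0.5)):
    """Linear combination of two kernels.

    The weights are fixed by hand here (an assumption); the paper
    instead solves for them with semi-definite programming.
    """
    return weights[0] * rbf_kernel(x, y) + weights[1] * poly_kernel(x, y)


def gram_matrix(points, kernel):
    """Kernel (Gram) matrix K[i][j] = kernel(points[i], points[j])."""
    return [[kernel(p, q) for q in points] for p in points]


def feature_space_dist(K, i, members):
    """Squared feature-space distance from sample i to the mean of `members`."""
    m = len(members)
    cross = sum(K[i][j] for j in members) / m
    within = sum(K[j][l] for j in members for l in members) / (m * m)
    return K[i][i] - 2.0 * cross + within


def kernel_kmeans(K, k, n_iter=20):
    """Kernel K-means on a precomputed Gram matrix; returns cluster labels."""
    n = len(K)
    labels = [i % k for i in range(n)]  # simple deterministic initialization
    for _ in range(n_iter):
        clusters = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        new_labels = []
        for i in range(n):
            dists = [
                feature_space_dist(K, i, members) if members else float("inf")
                for members in clusters
            ]
            new_labels.append(dists.index(min(dists)))
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels


def representatives(K, labels, k):
    """For each non-empty cluster, keep the sample nearest the cluster mean."""
    reps = []
    for c in range(k):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:
            reps.append(min(members, key=lambda i: feature_space_dist(K, i, members)))
    return reps
```

A reduced training set would then consist of the points indexed by `representatives(...)`, and `gram_matrix(reduced_points, combined_kernel)` would give a kernel matrix that could be passed to any SVM solver accepting a precomputed kernel.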