Support vector machines (SVMs) generalize well in text classification even when few training samples are available, but their learning performance and generalization ability depend on the choice of parameters. To improve SVM performance, this paper proposes E-SVM, an optimized SVM algorithm that uses information entropy to characterize the penalty coefficient C and introduces weighting coefficients. The algorithm sets parameters automatically during SVM training, which reduces the blindness of parameter selection, shrinks the training sample set, and improves SVM performance. Experiments show that E-SVM achieves better classification accuracy and time efficiency than the traditional SVM algorithm.
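
The abstract names information entropy as the quantity that characterizes the penalty coefficient C but does not give the formula. The following is a minimal sketch of one way such a rule might look, not the E-SVM method itself. The functions `entropy_scaled_penalty` and `class_weights`, the parameter `base_c`, and the inverse-frequency weighting are all assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy (in bits) of the empirical class distribution."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_scaled_penalty(labels, base_c=1.0):
    """Hypothetical rule: scale a base penalty C by the normalized entropy.

    Normalized entropy is 1 for perfectly balanced classes and falls
    toward 0 as the class distribution becomes more skewed.
    """
    counts = Counter(labels)
    if len(counts) < 2:
        # A single class has zero entropy; fall back to the base value.
        return base_c
    normalized = shannon_entropy(labels) / math.log2(len(counts))
    return base_c * normalized


def class_weights(labels):
    """Hypothetical weighting coefficients: inverse class frequency."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: total / (len(counts) * n) for cls, n in counts.items()}


if __name__ == "__main__":
    labels = ["sports", "sports", "sports", "finance"]
    print(shannon_entropy(labels))
    print(entropy_scaled_penalty(labels, base_c=10.0))
    print(class_weights(labels))
```

Values computed this way could then be passed to an SVM implementation that exposes a penalty parameter and per-class weights, for example the `C` and `class_weight` arguments of scikit-learn's `SVC`.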