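The potential support vector machine (P-SVM), a novel wrapper-type feature selection method, has been applied successfully in many fields. However, analysis based on the basic principle of the Fisher criterion shows that the objective function of P-SVM is only a special form of the within-class scatter in which the mean of every class is 0, which limits the applicability of the method. Moreover, requiring every class to have zero mean tends, to some extent, to make samples overlap around the zero vector, which hinders P-SVM from finding the optimal decision hyperplane and degrades classification performance. Therefore, the objective function is reconstructed using the general within-class scatter, and a generalized potential support features selection method (GPSFM) is proposed. GPSFM inherits, to a certain extent, the advantages of P-SVM, and additionally offers low redundancy among the selected features, fast selection, and strong adaptability, so that it achieves better feature selection and classification performance than P-SVM. Experimental results confirm these advantages.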
Feature selection is one of the fundamental problems in machine learning. A well-designed feature selection step not only reduces system complexity and processing time but can also improve system performance in many cases. It becomes even more critical to the success of a machine learning algorithm in problems involving a large number of irrelevant features. The potential support vector machine (P-SVM), a new wrapper feature selection approach, has been applied successfully in several fields. However, analysis based on the Fisher linear discriminant criterion shows that P-SVM works only when the mean of each class is zero, which makes it difficult to obtain the best decision boundaries for the sample data and therefore lowers the classification capability of P-SVM. Based on this finding, this paper adopts a new criterion function built on the general within-class scatter and proposes a generalized potential support features selection method (GPSFM). GPSFM retains the advantages of P-SVM to some extent and additionally offers low redundancy among the selected features, high selection speed, and better adaptability. Compared with the traditional P-SVM, the new method is therefore considerably stronger in both feature selection and classification. Our experimental results demonstrate these advantages of the proposed GPSFM.
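To make the distinction between the general within-class scatter and its zero-mean special case concrete, the following minimal sketch computes both on a toy data set. It is an illustration only, not the GPSFM or P-SVM algorithm: the helper names `class_means` and `scatter`, the `centered` flag, and the sample points are all assumptions introduced here. It assumes the standard Fisher definition of within-class scatter, the sum over classes of (x - m_c)(x - m_c)^T, and treats the zero-mean case as the sum of x x^T.

```python
def class_means(X, y):
    """Return {label: mean vector} for samples X (list of lists) and labels y."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def scatter(X, y, centered=True):
    """Within-class scatter matrix (hypothetical helper for illustration).

    centered=True  -> general form: sum of (x - m_c)(x - m_c)^T
    centered=False -> zero-mean special case: sum of x x^T
    """
    d = len(X[0])
    means = class_means(X, y) if centered else {}
    S = [[0.0] * d for _ in range(d)]
    for x, label in zip(X, y):
        # With centered=False the class mean is taken to be the zero vector.
        m = means.get(label, [0.0] * d)
        diff = [xi - mi for xi, mi in zip(x, m)]
        for i in range(d):
            for j in range(d):
                S[i][j] += diff[i] * diff[j]
    return S


# Toy data (made up for illustration): two classes with nonzero means.
X = [[2.0, 1.0], [3.0, 1.0], [2.0, 2.0],
     [-2.0, -1.0], [-3.0, -1.0], [-2.0, -2.0]]
y = [0, 0, 0, 1, 1, 1]

S_general = scatter(X, y, centered=True)     # uses the actual class means
S_zero_mean = scatter(X, y, centered=False)  # pretends every class mean is 0

print("general  :", S_general)
print("zero-mean:", S_zero_mean)
```

When the class means are far from the origin, as in this toy set, the zero-mean form is dominated by the offset of each class from the origin rather than by the spread within each class. The two matrices coincide only when every class mean is actually zero.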