Sample selection can speed up the training of a fuzzy support vector machine (fuzzy SVM) and, to some extent, improve its robustness to noise. However, effective samples are difficult to select and the selection ratio is usually high. Exploiting the ability of shadowed sets to analyze fuzzy sets, this paper proposes a new sample selection method for fuzzy SVM based on shadowed sets. The fuzzy set is divided into three subsets: trustworthy, untrustworthy, and uncertain data. Samples are selected only from the trustworthy and uncertain subsets, using subspace sample selection and border vector extraction, respectively. Experimental results show that the proposed method effectively reduces the selection ratio and the training time while preserving the generalization ability of the classifier. Furthermore, because the method removes the untrustworthy data, it also improves classification performance when the training samples contain noise.
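The three-way partition described above can be illustrated with a minimal sketch. It assumes the standard shadowed-set construction, in which a threshold alpha in (0, 0.5) is chosen so that the membership mass moved by reducing low grades to 0 and elevating high grades to 1 balances the number of elements left in the shadow. Mapping the elevated region to "trustworthy", the reduced region to "untrustworthy", and the shadow to "uncertain" is an assumption made here for illustration, as are the function names `shadowed_threshold` and `partition`. The abstract does not specify these details, and the subspace selection and border vector extraction steps are not shown.

```python
import numpy as np


def shadowed_threshold(mu, candidates=None):
    """Grid-search an alpha in (0, 0.5) that best balances
    |reduced mass + elevated mass - shadow cardinality|."""
    mu = np.asarray(mu, dtype=float)
    if candidates is None:
        # Hypothetical search grid; any grid inside (0, 0.5) would do.
        candidates = np.linspace(0.01, 0.49, 49)
    best_alpha, best_cost = None, np.inf
    for a in candidates:
        reduced = mu[mu <= a].sum()                  # grades pushed down to 0
        elevated = (1.0 - mu[mu >= 1.0 - a]).sum()   # grades pushed up to 1
        shadow = np.count_nonzero((mu > a) & (mu < 1.0 - a))
        cost = abs(reduced + elevated - shadow)
        if cost < best_cost:
            best_alpha, best_cost = float(a), cost
    return best_alpha


def partition(mu, alpha):
    """Return index arrays (trustworthy, uncertain, untrustworthy).
    The naming of the three regions is an assumption of this sketch."""
    mu = np.asarray(mu, dtype=float)
    trustworthy = np.flatnonzero(mu >= 1.0 - alpha)
    untrustworthy = np.flatnonzero(mu <= alpha)
    uncertain = np.flatnonzero((mu > alpha) & (mu < 1.0 - alpha))
    return trustworthy, uncertain, untrustworthy


# Toy membership grades, made up for illustration only.
mu = np.array([0.05, 0.1, 0.5, 0.9, 0.95])
alpha = shadowed_threshold(mu)
trustworthy, uncertain, untrustworthy = partition(mu, 0.2)
```

Under these assumptions, only the `trustworthy` and `uncertain` index sets would be passed on to the later selection steps, while the `untrustworthy` samples would be discarded before training.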