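Studies have shown that ensembles with a larger margin distribution generalize better. Based on this finding, the paper constructs a new margin-based metric (MM) that takes into account the classification ability of both the base classifiers and the ensemble, and then proposes a new ensemble selection method. The method initializes the ensemble as empty (or full) and iteratively adds (or removes) the classifier with the largest (or smallest) MM value, reducing the ensemble size while improving its classification accuracy. Experiments on 24 randomly selected UCI data sets show that the method generalizes better than other advanced greedy ensemble selection algorithms.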
Among ensembles with the same training error, the one with larger margins on the training examples often generalizes better. Based on this result, a new margin-based metric (MM) was designed to evaluate the importance of a classifier, and a new ensemble selection method was then proposed. The method initializes the ensemble to be empty (or full) and then iteratively adds to (or removes from) the ensemble the classifier with the largest (or smallest) MM, reducing the ensemble size and improving its accuracy. Experimental results on 24 randomly selected UCI data sets showed that the method generalized better than other state-of-the-art greedy ensemble pruning methods.
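The forward (empty-start) variant of the greedy procedure described above can be sketched roughly as follows. The abstract does not give the definition of MM, so this sketch substitutes a stand-in score, the mean voting margin of the candidate sub-ensemble on the training examples. The function names `mean_margin` and `forward_select`, the `target_size` stopping rule, and the prediction-matrix layout are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter


def mean_margin(members, preds, labels):
    """Average voting margin of the sub-ensemble `members`.

    Stand-in for the paper's MM metric (whose definition is not given
    in the abstract). For each example, the margin is
    (votes for the true label - most votes for any wrong label) / size.
    preds[i][j] is the label predicted by classifier i on example j.
    """
    total = 0.0
    for j, y in enumerate(labels):
        votes = Counter(preds[i][j] for i in members)
        correct = votes.get(y, 0)
        wrong = max((v for label, v in votes.items() if label != y), default=0)
        total += (correct - wrong) / len(members)
    return total / len(labels)


def forward_select(preds, labels, target_size):
    """Greedy forward selection: start empty, repeatedly add the
    classifier that maximizes the stand-in score of the enlarged
    sub-ensemble, until `target_size` (a hypothetical stopping rule)
    is reached."""
    selected = []
    remaining = list(range(len(preds)))
    while remaining and len(selected) < target_size:
        best = max(
            remaining,
            key=lambda i: mean_margin(selected + [i], preds, labels),
        )
        selected.append(best)
        remaining.remove(best)
    return selected
```

The backward (full-start) variant would mirror this loop: begin with all classifiers selected and repeatedly drop the one with the smallest score. Ties here are broken by classifier index, which is another arbitrary choice of this sketch.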