Constructing base classifiers that are both highly diverse and highly accurate is a central problem in ensemble learning. This paper proposes a novel ensemble method to address it: particle swarm optimization (PSO) is used to search for the feature weight distribution that minimizes the classification error rate on the training data sampled according to AdaBoost's sample-weight distribution. Random feature subspaces are then drawn according to this optimal weight distribution and used in AdaBoost's training process, which increases the diversity among classifiers while preserving the accuracy of each base classifier. Finally, majority voting is used to fuse the base classifiers' decisions, and simulation experiments verify the effectiveness of the proposed method.
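The training loop described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the PSO search is omitted and a hypothetical fixed `feature_weights` vector stands in for its output, weighted decision stumps serve as the base learners, and the fusion step is a plain (unweighted) majority vote as the abstract states.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_subspace(feature_weights, k, rng):
    # Draw k distinct feature indices with probability proportional to the
    # (here assumed, in the paper PSO-optimized) feature weight distribution.
    p = np.asarray(feature_weights, dtype=float)
    p = p / p.sum()
    return rng.choice(len(p), size=k, replace=False, p=p)

def stump_train(X, y, w):
    # Weighted decision stump: pick the (feature, threshold, sign)
    # minimizing the weighted training error under sample weights w.
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for s in (1, -1):
                pred = np.where(X[:, j] <= t, s, -s)
                err = w[pred != y].sum()
                if best is None or err < best[0]:
                    best = (err, j, t, s)
    return best

def stump_predict(stump, X):
    _, j, t, s = stump
    return np.where(X[:, j] <= t, s, -s)

def subspace_adaboost(X, y, feature_weights, n_rounds=5, k=2, rng=rng):
    # AdaBoost rounds, each trained on a randomly sampled feature subspace.
    n = len(y)
    w = np.full(n, 1.0 / n)                 # AdaBoost sample weights
    ensemble = []
    for _ in range(n_rounds):
        feats = sample_subspace(feature_weights, k, rng)
        stump = stump_train(X[:, feats], y, w)
        err = max(stump[0], 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)
        pred = stump_predict(stump, X[:, feats])
        w = w * np.exp(-alpha * y * pred)   # reweight misclassified samples up
        w = w / w.sum()
        ensemble.append((feats, stump))
    return ensemble

def majority_vote(ensemble, X):
    # Fuse base classifiers by unweighted majority vote, as in the abstract.
    agg = np.zeros(len(X))
    for feats, stump in ensemble:
        agg += stump_predict(stump, X[:, feats])
    return np.sign(agg)
```

In the full method, `feature_weights` would be the particle swarm's best-found position, evaluated by the weighted error the resulting subspaces yield inside AdaBoost; here it is simply a given vector.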