To address the problem that diversity evaluation in multiple classifier systems cannot directly handle fuzzy data, a classifier-ensemble diversity measure based on complement information entropy (CIE) is proposed. First, a set of base classifiers is generated from the training data and used to classify the test data, and their outputs are combined in turn to form a classification data space. Then, the complement information entropy under a fuzzy relation is used to measure the amount of uncertainty information contained in this data space, and this information is used to judge the diversity among the base classifiers. Finally, taking an increase in the diversity of the data space after adding a base classifier as the basic criterion for classifier selection, an ensemble classifier system is constructed to verify the relationship between the CIE diversity measure and ensemble classification accuracy. Experimental results show that, compared with the Q-statistic method, ensembles built with CIE improve the average ensemble classification accuracy by 2.03%, reduce the ensemble size by about 17%, and enhance the ability of the ensemble system to handle diverse data.
A novel diversity measure using complement information entropy (CIE) is proposed to solve the problem that diversity estimation in multiple classifier systems cannot deal directly with fuzzy data. A set of base classifiers is generated from the training data and then used to label the test data, and the classifiers' outputs are reorganized into a new classification data space. The complement information entropy model under a fuzzy relation is then introduced to measure the uncertainty information of this space, and the uncertainty information is used to estimate the diversity of the base classifiers. Finally, an ensemble system is constructed under the criterion that a base classifier is added only if it increases the ensemble diversity of the classifier set, and this system is used to validate the relationship between the CIE diversity measure and ensemble classification accuracy. Experimental results, compared with the Q-statistic method, show that CIE increases the average classification accuracy by 2.03% and reduces the number of ensemble classifiers by about 17%. Moreover, CIE also improves the ability of ensemble systems to process diverse data.
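The sketch below is a minimal illustration of the kind of procedure the abstract describes, not the paper's exact method. It assumes one common form of complement information entropy for a fuzzy relation, E(R) = Σ_i (1/n)(1 − |S_R(x_i)|/n) with |S_R(x_i)| = Σ_j R(x_i, x_j), and it builds the fuzzy relation from label agreement among base classifiers; the relation definition, the starting classifier, and all function names (fuzzy_relation, complement_entropy, forward_select) are illustrative assumptions.

```python
import numpy as np

def fuzzy_relation(outputs):
    """Build a fuzzy similarity relation over test samples from the
    classification data space: R[i, j] is the fraction of the selected
    base classifiers that assign samples i and j the same label.
    (Illustrative choice; the paper may define the relation differently.)"""
    n_samples, n_classifiers = outputs.shape
    agree = (outputs[:, None, :] == outputs[None, :, :]).sum(axis=2)
    return agree / n_classifiers

def complement_entropy(relation):
    """Assumed complement information entropy of a fuzzy relation R:
    E(R) = sum_i (1/n) * (1 - |S_R(x_i)| / n),
    where |S_R(x_i)| = sum_j R(i, j) is the cardinality of the fuzzy
    similarity class of sample i."""
    n = relation.shape[0]
    card = relation.sum(axis=1)
    return float(np.sum((1.0 / n) * (1.0 - card / n)))

def forward_select(outputs):
    """Greedy ensemble construction following the abstract's criterion:
    a base classifier is kept only if adding it increases the
    complement-entropy diversity of the classification data space."""
    n_samples, n_classifiers = outputs.shape
    selected = [0]  # start from the first base classifier (assumption)
    best = complement_entropy(fuzzy_relation(outputs[:, selected]))
    for k in range(1, n_classifiers):
        candidate = selected + [k]
        score = complement_entropy(fuzzy_relation(outputs[:, candidate]))
        if score > best:  # diversity increases, so keep classifier k
            selected, best = candidate, score
    return selected

# Toy usage: outputs[i, k] is the label predicted for test sample i
# by base classifier k (fabricated data purely for illustration).
outputs = np.array([[0, 0, 1],
                    [1, 1, 0],
                    [0, 1, 1],
                    [1, 0, 0]])
print(forward_select(outputs))
```

In this sketch, disagreement among the selected classifiers lowers the pairwise similarity values, which shrinks the fuzzy similarity classes and raises the complement entropy, so a higher entropy corresponds to a more diverse classifier subset.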