A weight computation strategy based on classifier confidence is proposed to address the difficulty of assigning weights to sub-classifiers in ensemble classification of dynamic data streams. The method takes full account of how samples at different positions affect the weight computation. Information entropy is used to describe a classifier's uncertainty about its prediction results, a relationship between classifier confidence and the samples is established, and a quantitative method for computing classifier confidence is then derived. Finally, in light of the requirements of dynamic data stream classification and the characteristics of concept drift, a dynamically weighted ensemble classification model based on classifier confidence is built using batch learning and a time-based forgetting strategy. Theoretical analysis and experimental results show that the scheme is feasible and offers some advantages over traditional methods.
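The weighting idea summarized above can be illustrated with a minimal sketch. The abstract does not give the actual formulas, so everything here is an assumption made for illustration: confidence is taken as one minus the normalized Shannon entropy of a classifier's predicted class probabilities, averaged over a batch, and time forgetting is modeled as exponential decay in the age of the classifier. The function names and the `decay` parameter are hypothetical and are not taken from the paper.

```python
import math


def normalized_entropy(probs):
    """Shannon entropy of a class-probability vector, scaled to [0, 1]."""
    k = len(probs)
    if k < 2:
        return 0.0
    h = sum(-p * math.log(p) for p in probs if p > 0)
    return h / math.log(k)


def classifier_confidence(prob_rows):
    """Assumed confidence: mean of (1 - normalized entropy) over one batch.

    prob_rows holds one predicted class-probability vector per sample.
    """
    if not prob_rows:
        return 0.0
    return sum(1.0 - normalized_entropy(r) for r in prob_rows) / len(prob_rows)


def ensemble_weights(confidences, ages, decay=0.1):
    """Assumed weighting: confidence * exp(-decay * age), normalized to sum to 1.

    ages counts how many batches ago each sub-classifier was trained.
    """
    raw = [c * math.exp(-decay * a) for c, a in zip(confidences, ages)]
    total = sum(raw)
    if total == 0:
        # Fall back to uniform weights when no classifier carries any weight.
        return [1.0 / len(raw)] * len(raw)
    return [r / total for r in raw]
```

Under these assumptions, a sub-classifier that outputs near-uniform class probabilities on the current batch receives a confidence near 0, and among equally confident sub-classifiers the one trained on the older batch receives the smaller weight, which is one simple way to let the ensemble follow concept drift.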