To meet the practical need for accurate, automatic classification of large volumes of electronic documents into multi-level category hierarchies, this paper proposes a hierarchical classification method based on multiple feature selection and the fusion of multiple classifiers. A reliability function is introduced to evaluate the output of a single classifier; when that output is judged unreliable, auxiliary classifiers are brought in and documents that are hard to classify are decided by voting. Experimental results show that, compared with a single classifier, the method achieves higher classification accuracy on both flat and hierarchical classification corpora, has favorable time complexity, and shows good prospects for practical application.
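The reliability-gated voting scheme described in the abstract can be illustrated with a minimal sketch. The abstract does not define the reliability function, the voting rule, or the classifiers, so everything here is an assumption made for illustration: `confidence` uses the margin between the two highest scores as a stand-in reliability measure, `threshold` is an arbitrary cutoff, ties are broken by the primary classifier's score, and `classify_path` assumes an acyclic category tree keyed by node name. It is not the authors' implementation.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence, Tuple

Scores = Dict[str, float]              # label -> score, higher is better
Classifier = Callable[[str], Scores]   # document text -> scores per label


def confidence(scores: Scores) -> float:
    """Hypothetical reliability measure: margin between the two best scores."""
    ranked = sorted(scores.values(), reverse=True)
    if len(ranked) < 2:
        return ranked[0] if ranked else 0.0
    return ranked[0] - ranked[1]


def top_label(scores: Scores) -> str:
    """Label with the highest score."""
    return max(scores, key=lambda label: scores[label])


def classify(doc: str, primary: Classifier,
             auxiliaries: Sequence[Classifier],
             threshold: float = 0.2) -> str:
    """Trust the primary classifier when it is confident; otherwise vote."""
    primary_scores = primary(doc)
    if confidence(primary_scores) >= threshold:
        return top_label(primary_scores)

    # Hard document: the primary and every auxiliary classifier cast one vote.
    votes = Counter([top_label(primary_scores)])
    votes.update(top_label(aux(doc)) for aux in auxiliaries)

    # Assumed tie-break: prefer the tied label the primary scored highest.
    best = max(votes.values())
    tied = [label for label, count in votes.items() if count == best]
    return max(tied, key=lambda label: primary_scores.get(label, 0.0))


def classify_path(doc: str,
                  tree: Dict[str, Tuple[Classifier, Sequence[Classifier]]],
                  root: str = "root",
                  threshold: float = 0.2) -> List[str]:
    """Descend an (assumed acyclic) category tree, classifying at each node."""
    path: List[str] = []
    node = root
    while node in tree:                # leaves have no entry in the tree
        primary, auxiliaries = tree[node]
        node = classify(doc, primary, auxiliaries, threshold)
        path.append(node)
    return path
```

Gating on confidence means the auxiliary classifiers run only for documents the primary classifier finds ambiguous, which is one plausible reading of how such a scheme could keep the extra running time low.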