Two strategies, hierarchical clustering and confusion-class grouping, are used to construct a hierarchy of document categories from a confusion matrix, and hierarchical classification is then evaluated experimentally on the resulting hierarchies. The results show that the confusion-class strategy outperforms the hierarchical-clustering strategy and improves both the recall and the precision of flat document classification.
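
The two hierarchy-construction strategies can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, the linkage rule, or any threshold, so everything below is an assumption made for illustration: the symmetric confusion rate, the average-linkage merge rule, the `threshold` value of 0.1, the example matrix `cm`, and the function names `confusion_rates`, `confusion_groups`, and `agglomerative_merges` are all hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: similarity measure, threshold, and data are assumptions.

def confusion_rates(cm):
    """Symmetric confusion rate between every pair of classes.

    cm[i][j] is the number of documents of true class i predicted as class j.
    """
    n = len(cm)
    totals = [sum(row) for row in cm]
    rate = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                a = cm[i][j] / totals[i] if totals[i] else 0.0
                b = cm[j][i] / totals[j] if totals[j] else 0.0
                rate[i][j] = (a + b) / 2
    return rate


def confusion_groups(cm, threshold=0.1):
    """Confusion-class strategy (assumed form): put classes whose mutual
    confusion rate reaches the threshold under a common parent node."""
    rate = confusion_rates(cm)
    n = len(cm)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if rate[i][j] >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def agglomerative_merges(cm):
    """Hierarchical-clustering strategy (assumed average linkage): repeatedly
    merge the two clusters with the highest mean confusion rate."""
    rate = confusion_rates(cm)
    clusters = [[i] for i in range(len(cm))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                score = sum(rate[i][j] for i, j in pairs) / len(pairs)
                if best is None or score > best[0]:
                    best = (score, a, b)
        _, a, b = best
        merged = sorted(clusters[a] + clusters[b])
        merges.append(merged)
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return merges


# Made-up 4-class confusion matrix: classes 0/1 and 2/3 are often confused.
cm = [
    [80, 15, 3, 2],
    [12, 85, 2, 1],
    [1, 2, 70, 27],
    [2, 1, 30, 67],
]

if __name__ == "__main__":
    print(confusion_groups(cm))
    print(agglomerative_merges(cm))
```

In this sketch the confusion-class grouping yields a shallow two-level hierarchy (a root, groups of mutually confused classes, then leaves), whereas the agglomerative procedure yields a full binary tree of merges. A hierarchical classifier would then train one classifier per internal node of whichever hierarchy is chosen.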