C4.5 is a classic decision-tree data-mining algorithm that classifies and analyzes data on the basis of information entropy theory. It consists mainly of four steps: data preprocessing, decision-tree generation, decision-tree pruning, and extraction of rules from the tree. This paper introduces the C4.5 algorithm and applies it to the analysis of data from a second-class forest resource inventory. The mining results indicate that data mining has broad application prospects in the analysis of forest resource inventory data.
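To make the entropy-based criterion concrete, the following is a minimal Python sketch of the gain ratio that C4.5 uses to choose the attribute to split on during tree generation. It is an illustration only, not the implementation used in this paper: the function names (`entropy`, `gain_ratio`, `best_attribute`), the attribute names, and the four toy records are all hypothetical and are not taken from the inventory data. The sketch handles categorical attributes only and omits pruning and rule extraction.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def gain_ratio(rows, attr, target):
    """Gain ratio of splitting `rows` on the categorical attribute `attr`."""
    total = len(rows)
    # Group the target labels by the value of the candidate attribute.
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attr], []).append(row[target])
    # Information gain: entropy before the split minus weighted entropy after it.
    before = entropy([row[target] for row in rows])
    after = sum(len(p) / total * entropy(p) for p in partitions.values())
    gain = before - after
    # Split information: entropy of the attribute's own value distribution.
    split_info = entropy([row[attr] for row in rows])
    return gain / split_info if split_info > 0 else 0.0


def best_attribute(rows, attrs, target):
    """Return the attribute with the highest gain ratio."""
    return max(attrs, key=lambda a: gain_ratio(rows, a, target))


# Hypothetical toy records, invented purely for illustration.
rows = [
    {"slope": "gentle", "soil": "thick", "grade": "good"},
    {"slope": "gentle", "soil": "thin", "grade": "good"},
    {"slope": "steep", "soil": "thick", "grade": "poor"},
    {"slope": "steep", "soil": "thin", "grade": "poor"},
]

chosen = best_attribute(rows, ["slope", "soil"], "grade")
print(chosen)
```

In this toy data the `slope` attribute separates the two grades perfectly while `soil` carries no information about the grade, so the sketch selects `slope` as the first split. A full C4.5 tree would apply the same selection recursively to each partition and then prune the result.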