To obtain the spatial distribution of regional soil quality accurately, this study made full use of the main factors that affect soil quality, including soil type, land use, lithology type, topography, roads, and industry type. Mutual information theory was used to screen 13 auxiliary variables (lithology type, land use, soil type, distance to town, distance to road, distance to industrial land, distance to river, relative elevation, slope, aspect, plan curvature, profile curvature, and tangential curvature), and the decision tree algorithm See5.0 was then applied to predict the soil quality grade of the study area. The main factors affecting soil quality in the study area were soil type, land use, lithology type, distance to town, distance to water area, relative elevation, distance to road, and distance to industrial land. The decision tree model built on the variables selected by mutual information was markedly more accurate than the model built on all variables; for the former, the classification accuracy exceeded 80% for both the decision tree and the derived decision rules. By making full use of both continuous and categorical data, the combination of mutual information theory and decision trees not only reduced the number of input variables required by a general decision tree algorithm, but also predicted and assessed regional soil quality grades effectively.
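The variable-screening step can be illustrated with a minimal sketch of mutual information between a candidate variable and the soil quality grade. The abstract does not give the estimator, the discretization of continuous variables, or the selection threshold, so the function names (`mutual_information`, `rank_variables`) and the toy data below are assumptions made purely for illustration; the sketch handles discrete values only and does not reproduce See5.0.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information I(X;Y) in bits between two discrete sequences."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of X
    py = Counter(ys)            # marginal counts of Y
    pxy = Counter(zip(xs, ys))  # joint counts of (X, Y)
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2(p(x,y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def rank_variables(table, target):
    """Return (name, MI) pairs sorted by decreasing MI with the target."""
    scores = {name: mutual_information(col, target) for name, col in table.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data, not taken from the study.
grade = ["high", "high", "low", "low", "high", "low", "high", "low"]
toy = {
    # In this toy sample, soil type fully determines the grade.
    "soil_type": ["A", "A", "B", "B", "A", "B", "A", "B"],
    # In this toy sample, aspect is statistically independent of the grade.
    "aspect": ["N", "S", "N", "S", "N", "S", "S", "N"],
}

ranking = rank_variables(toy, grade)
for name, score in ranking:
    print(f"{name}: {score:.3f} bits")
```

Variables with higher scores carry more information about the grade and would be kept as predictors; continuous variables such as slope or relative elevation would first need to be binned, and the binning scheme is a further assumption not specified in the abstract.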