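Organic carbon in the wetland soils of arid and semi-arid regions is one of the important factors that affect soil quality and constrain plant growth, and changes in its content affect the security and stability of the ecosystem. To estimate the organic carbon content of wetland soils rapidly, 140 desert soil samples were collected in the Ebinur Lake Wetland Reserve, Xinjiang. Using the visible/near-infrared spectra of the soils together with soil organic carbon data obtained by chemical analysis, the raw spectral reflectance was first subjected to convolution smoothing (S-G), and 2 spectral pre-processing indices were then derived: the first order differential and the first order differential of the logarithm of the inverse. Ant colony optimization-interval partial least squares (ACO-iPLS) and support vector machine-based recursive feature elimination (SVM-RFE) were used to select the near-infrared feature wavelengths for soil organic carbon content, and partial least squares regression and support vector regression models of soil organic carbon content were built on that basis. The results show that: 1) models built from the first order differential of the raw reflectance predict better than models built from the first order differential of the logarithm of the inverse; 2) a comparison of the 4 modelling results shows that the soil organic carbon model built from the first order differential of the raw reflectance, after feature variable selection with SVM-RFE, has the highest prediction accuracy. The correlation coefficient and root mean square error were 0.9687 and 0.158% for the training set, and 0.9091 and 0.268% for the testing set. Therefore, the model built with convolution smoothing, first order differential pre-processing, and SVM-RFE has high prediction accuracy and good robustness, and can serve as an effective means of estimating the soil organic carbon content of desert wetlands.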
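The two accuracy measures reported above, the correlation coefficient and the root mean square error, can be illustrated with a minimal sketch. It assumes the correlation coefficient is Pearson's r between measured and predicted SOC content, which the abstract does not state explicitly; the function names `pearson_r` and `rmse` are hypothetical and are not taken from the study.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation between measured and predicted values.

    Assumption: the 'correlation coefficient' in the abstract is Pearson's r.
    """
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_o * var_p)


def rmse(observed, predicted):
    """Root mean square error, in the same unit as the inputs (here, % SOC)."""
    n = len(observed)
    if n != len(predicted) or n == 0:
        raise ValueError("need two equally long, non-empty sequences")
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
```

Both functions would be applied separately to the training set and the testing set, so that a large gap between the two sets of values can flag overfitting.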
Soil organic carbon (SOC) is a critical soil property that has a profound impact on soil quality and plant growth. It is involved in soil structural formation and atmospheric carbon sequestration, and this is especially true in arid and semi-arid regions. Accurate determination of SOC is therefore important. Traditionally, SOC determination has been limited to laboratory techniques such as wet or dry combustion, ion sensing electrodes, loss on ignition, or chemical assays. These approaches often involve expensive testing materials, time-consuming sample preparation, and the production of excessive environmental pollutants, so an approach that can quantify SOC content with savings in time and cost is needed. With 140 soil samples acquired from the Ebinur Lake wetland protection area in Xinjiang, China, this research applies 2 hyperspectral data mining algorithms, namely ant colony optimization-interval partial least squares (ACO-iPLS) and support vector machine-recursive feature elimination (SVM-RFE), to improve the accuracy of SOC content estimation from laboratory visible and near-infrared (VIS/NIR) spectroscopy of soils (350-2500 nm). After convolution smoothing (S-G), 2 common spectral pre-processing methods, namely the first order differential and the first order differential of the logarithm of the inverse, are applied to the hyperspectral data to extract the feature wavelengths. Results indicate that the feature wavelengths pertaining to SOC are mainly located within 1786-1929 nm with ACO-iPLS, and at 745-910, 1677, 1755, and 1911-2254 nm with SVM-RFE. With the extracted feature wavelengths, models are then built with the same 2 approaches, using half of the samples (70 soil samples) as the training set and the other half (70 soil samples) as the testing set.
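The pre-processing chain described above (S-G convolution smoothing followed by the two differential transforms) can be sketched as follows. This is an illustration under stated assumptions, not the configuration used in the study: the abstract gives neither the smoothing window nor the polynomial order, so a 5-point quadratic/cubic S-G kernel is assumed; the logarithm is assumed to be base 10; and the derivative is approximated by central differences. All function names are hypothetical.

```python
import math

# Assumed kernel: standard 5-point quadratic/cubic Savitzky-Golay smoothing
# coefficients. The window actually used in the study is not given.
SG_COEFFS = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def savgol_smooth(reflectance):
    """Smooth one spectrum; the two points at each end are left unchanged."""
    n = len(reflectance)
    if n < 5:
        raise ValueError("need at least 5 spectral points")
    smoothed = list(reflectance)
    for i in range(2, n - 2):
        smoothed[i] = sum(c * reflectance[i + k - 2]
                          for k, c in enumerate(SG_COEFFS))
    return smoothed


def first_derivative(values, wavelengths):
    """Central-difference first derivative at the interior points (n - 2 values)."""
    return [(values[i + 1] - values[i - 1])
            / (wavelengths[i + 1] - wavelengths[i - 1])
            for i in range(1, len(values) - 1)]


def log_inverse(reflectance):
    """log10(1/R) transform; reflectance must be strictly positive."""
    return [math.log10(1.0 / r) for r in reflectance]


def preprocess(reflectance, wavelengths):
    """Return (first derivative of R, first derivative of log10(1/R))."""
    smoothed = savgol_smooth(reflectance)
    fd = first_derivative(smoothed, wavelengths)
    lfd = first_derivative(log_inverse(smoothed), wavelengths)
    return fd, lfd
```

Each of the two returned series would then feed the feature wavelength selection step (ACO-iPLS or SVM-RFE). In practice a library routine such as SciPy's `savgol_filter` would likely replace the hand-written kernel.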
The results show that the spectra processed with the combination of the S-G and first order with reflectance perform much better than the logarithm of first order differential of the logarit