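A new method is proposed for the quantitative analysis of the components of gas mixtures from infrared spectra, combining kernel principal component analysis (KPCA) feature extraction with support vector regression (SVR). The spectra of the gas mixture, whose characteristic absorption lines overlap severely, are first mapped into a high-dimensional feature space by a nonlinear transformation. Principal component analysis is then applied in that feature space, and the extracted principal components serve as the inputs to SVR for building the calibration model. This enabled quantitative analysis of gas mixtures of seven components with severely overlapping characteristic absorption spectra: methane, ethane, propane, isobutane, n-butane, isopentane, and n-pentane. For the seven components of gas mixtures of unknown concentration, the RMSE (φ×10^-6) of predictions by the KPCA-SVR model was an order of magnitude lower than that of predictions by the SVR model alone. The results show that KPCA has a strong capacity for nonlinear feature extraction, makes full use of the whole-spectrum data, effectively removes noise from the spectral data, and reduces the data dimension. Combined with SVR, it improves the accuracy of infrared spectral analysis and shortens model computation time, making it an effective new method for infrared spectral analysis.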
This paper presents a new method for the quantitative analysis of mid-infrared spectra. The method combines kernel principal component analysis (KPCA) with support vector regression (SVR) to build a quantitative analysis model of multi-component gas mixtures. First, the spectra of the multi-component gas mixture samples were mapped nonlinearly into a high-dimensional feature space using Gaussian kernels. PCA was then used to compute the principal components efficiently in that feature space. After the optimal number of principal components was determined, the extracted features (principal components) were used as the inputs of SVR to build the quantitative analysis model of seven-component gas mixtures. The prediction RMSEs (φ × 10^-6) of the KPCA-SVR model for the seven component gases of the prediction-set samples were 124.37, 72.44, 136.51, 87.29, 153.01, 57.12, and 81.72, respectively, about one order of magnitude lower than those of the SVR-only model. Modeling and prediction with KPCA-SVR took 46.59 s and 4.94 s, respectively, considerably less than the 752.52 s and 26.21 s required by SVR alone. These results show that KPCA has an excellent capacity for nonlinear feature extraction. It makes full use of the information in the entire spectral range and effectively reduces both the noise and the dimension of the spectra. Combining KPCA with SVR improves the precision of the model and shortens the time needed for modeling and analysis. We conclude that KPCA-SVR is an effective new method for quantitative infrared spectroscopic analysis.
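The processing chain described above (Gaussian-kernel KPCA for feature extraction, then SVR on the extracted principal components) can be sketched with scikit-learn as follows. This is only an illustrative sketch, not the authors' implementation. The synthetic overlapping-band spectra, the helper names `make_synthetic_spectra`, `build_kpca_svr`, and `rmse_per_component`, the SVR kernel, and every hyperparameter value (`n_components`, `C`, `epsilon`, the default `gamma`) are assumptions, since the abstract does not report them.

```python
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.multioutput import MultiOutputRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def make_synthetic_spectra(n_samples=120, n_points=200, n_gases=7, seed=0):
    """Hypothetical stand-in data: seven strongly overlapping Gaussian bands."""
    rng = np.random.default_rng(seed)
    axis = np.linspace(0.0, 1.0, n_points)
    centers = np.linspace(0.40, 0.60, n_gases)  # close centers, so bands overlap
    bands = np.exp(-((axis[None, :] - centers[:, None]) ** 2) / (2 * 0.05 ** 2))
    conc = rng.uniform(0.0, 1.0, size=(n_samples, n_gases))
    # Saturation term makes the signal nonlinear in concentration (assumed).
    spectra = 1.0 - np.exp(-conc @ bands)
    spectra += rng.normal(scale=0.005, size=spectra.shape)
    return spectra, conc


def build_kpca_svr(n_components=10):
    """Gaussian-kernel KPCA followed by one SVR per gas component."""
    return make_pipeline(
        StandardScaler(),
        # kernel="rbf" is the Gaussian kernel; gamma is left at its default.
        KernelPCA(n_components=n_components, kernel="rbf"),
        # SVR kernel and C/epsilon are assumed values, not from the paper.
        MultiOutputRegressor(SVR(kernel="rbf", C=100.0, epsilon=0.01)),
    )


def rmse_per_component(y_true, y_pred):
    """Root-mean-square error computed separately for each gas component."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2, axis=0))


X, y = make_synthetic_spectra()
X_train, y_train = X[:90], y[:90]   # calibration set
X_test, y_test = X[90:], y[90:]     # prediction set

model = build_kpca_svr()
model.fit(X_train, y_train)
pred = model.predict(X_test)
rmse = rmse_per_component(y_test, pred)
print("RMSE per component:", rmse)
```

In practice the number of retained principal components and the kernel width would be chosen by cross-validation on the calibration set, which corresponds to the "optimal number of principal components" step mentioned in the abstract.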