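An infrared spectral modeling method, AN-PLS, is proposed that combines autoencoder network (AN) manifold learning with partial least squares (PLS). AN-PLS first applies the AN algorithm to reduce the dimensionality of the infrared spectral data nonlinearly and then builds a regression model with PLS. The method was used to build near-infrared and mid-infrared spectral regression models for the insoluble dietary fiber content of moso bamboo shoots. The results show that regression models built with AN-PLS have a smaller root mean square error of prediction (RMSEP) and a higher coefficient of determination (R^2) than models built with other common spectral preprocessing methods combined with PLS, or with PLS alone. AN-PLS therefore offers better modeling and prediction ability, and near-infrared and mid-infrared spectroscopy combined with AN-PLS modeling can accurately measure the insoluble dietary fiber content of moso bamboo shoots.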
The autoencoder network (AN) is a nonlinear dimension-reduction manifold learning algorithm that can effectively find the nonlinear low-dimensional manifold structure in high-dimensional spectral data. In the present paper, a nonlinear infrared (IR) spectral modeling method, AN-PLS, is proposed by combining AN with partial least squares (PLS) to capture the nonlinear correlations between IR spectra and the physicochemical properties of samples. In AN-PLS, AN is used to reduce the dimensionality of the IR spectra and PLS is used to build the regression calibration model. AN-PLS was then applied to correlate the near-infrared (NIR) and mid-infrared (MIR) spectra with the concentrations of insoluble dietary fiber in bamboo shoots. The results indicate that AN-PLS predicts the concentrations of insoluble dietary fiber in bamboo shoots with a lower cross-validation RMS error (RMSECV) and a higher coefficient of determination (R^2) than other common spectral preprocessing methods combined with PLS, or PLS alone. It can be concluded that AN-PLS effectively models the nonlinear correlations between IR spectra and the physicochemical properties of the samples, and that it is feasible to determine the concentrations of insoluble dietary fiber in bamboo shoots accurately by coupling NIR and MIR spectra with the AN-PLS modeling method.
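The two-stage idea described above (nonlinear dimension reduction with an autoencoder, then PLS regression on the resulting codes) can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the single-hidden-layer tanh autoencoder, the synthetic "spectra" with a concentration-dependent peak shift, the layer sizes, the learning rate, and all function names (`train_autoencoder`, `pls1_fit`) are hypothetical choices made for the example. The network architecture and training procedure actually used in the paper are not given in the abstract.

```python
import numpy as np

def train_autoencoder(X, n_code=4, epochs=2000, lr=0.01, seed=0):
    """Train a one-hidden-layer tanh autoencoder by full-batch gradient
    descent on the reconstruction error; return the encoder function.
    (Assumed architecture, chosen only for illustration.)"""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, n_code)); b1 = np.zeros(n_code)
    W2 = rng.normal(0.0, 0.1, (n_code, d)); b2 = np.zeros(d)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)           # low-dimensional codes
        E = (H @ W2 + b2 - X) / n          # scaled reconstruction error
        dH = (E @ W2.T) * (1.0 - H ** 2)   # backpropagate through tanh
        W2 -= lr * (H.T @ E); b2 -= lr * E.sum(axis=0)
        W1 -= lr * (X.T @ dH); b1 -= lr * dH.sum(axis=0)
    return lambda Z: np.tanh(Z @ W1 + b1)

def pls1_fit(X, y, n_components):
    """PLS1 regression via NIPALS; returns coefficients and centering terms."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)             # weight vector
        t = Xk @ w                         # scores
        tt = t @ t
        p = Xk.T @ t / tt                  # X loadings
        c = yk @ t / tt                    # y loading
        Xk = Xk - np.outer(t, p)           # deflate X
        yk = yk - c * t                    # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    B = W @ np.linalg.solve(P.T @ W, q)    # regression coefficients
    return B, x_mean, y_mean

# Synthetic data (assumption): one absorption peak whose position shifts
# with concentration, which makes the spectrum-property relation nonlinear.
rng = np.random.default_rng(1)
n, d = 60, 50
conc = rng.uniform(0.0, 1.0, n)
axis = np.linspace(0.0, 1.0, d)
spectra = np.exp(-((axis[None, :] - 0.3 - 0.2 * conc[:, None]) ** 2) / 0.02)
spectra += 0.01 * rng.normal(size=(n, d))

X_train, X_test = spectra[:40], spectra[40:]
y_train, y_test = conc[:40], conc[40:]
mu = X_train.mean(axis=0)                  # center with training statistics

# Stage 1: nonlinear dimension reduction with the autoencoder.
encode = train_autoencoder(X_train - mu, n_code=4)
codes_train, codes_test = encode(X_train - mu), encode(X_test - mu)

# Stage 2: PLS regression on the low-dimensional codes.
B, c_mean, y_mean = pls1_fit(codes_train, y_train, n_components=2)
y_pred = (codes_test - c_mean) @ B + y_mean

rmsep = np.sqrt(np.mean((y_test - y_pred) ** 2))
r2 = 1.0 - np.sum((y_test - y_pred) ** 2) / np.sum((y_test - y_test.mean()) ** 2)
print(f"RMSEP = {rmsep:.4f}, R^2 = {r2:.4f}")
```

The sketch evaluates on a held-out test set; a cross-validated error such as RMSECV would instead repeat the two stages inside each fold so that the encoder never sees the validation samples.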