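When near-infrared spectroscopy with linear analysis is used for the rapid determination of protein in fishmeal, selecting appropriate wavelength variables is the key to improving model prediction accuracy. The main aim is to establish a robust, simple multiple linear regression (MLR) model, with the modeling wavelengths selected by a projection technique based on characteristic peaks. Based on the absorption characteristics of the target components in the near-infrared region, the peak and trough wavelengths of the first-derivative spectra of fishmeal are taken as the starting point. Stepwise multiple linear regression (SMLR) and successive projections linear regression (SPA-MLR) are then applied in turn to complete two rounds of informative-wavelength screening, a significance test is carried out on the candidate wavelength variables, and the combination of informative wavelengths for near-infrared linear analysis is finally determined. The results show that the 53 informative wavelength variables selected from the long-wave near-infrared region improve the prediction accuracy of the near-infrared quantitative model for fishmeal protein and simplify the model, thereby improving its applicability and robustness.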
In near-infrared rapid detection by linear analysis, selecting an appropriate combination of wavelength variables is key to improving model accuracy in determining the protein content of fishmeal samples. To establish simple, easy-to-operate multiple linear regression (MLR) models, a projection technique based on informative peaks is proposed for wavelength selection. Because near-infrared spectra retain target information in their peaks, the peaks and troughs of the first-derivative spectra were first identified, and informative wavelengths were then selected at two levels by successively applying stepwise multiple linear regression (SMLR) and successive projections linear regression (SPA-MLR). A significance test on the candidate variables (wavelengths) then identified the best combination of feature wavelengths. Model prediction results show that the 53 informative wavelengths selected from the long-wave near-infrared region improve the predictive accuracy of quantitative models for protein content in fishmeal samples while reducing the computational complexity relative to full-range spectrum analysis.
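The selection pipeline described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`derivative_extrema`, `spa`, `fit_mlr`) are hypothetical, the first derivative is approximated by a plain finite difference, and the stepwise regression (SMLR) pass and the significance test are left out for brevity. It assumes a spectral matrix `X` with samples in rows and wavelengths in columns.

```python
import numpy as np

def derivative_extrema(mean_spectrum):
    """Indices (into the first-difference spectrum) of its peaks and troughs."""
    d1 = np.diff(mean_spectrum)        # crude first derivative (assumption)
    slope = np.sign(np.diff(d1))       # sign of the slope of d1
    # a sign change in the slope marks a peak or trough of d1
    return np.where(slope[:-1] * slope[1:] < 0)[0] + 1

def spa(X, start, n_select):
    """Successive projections: greedily pick the least collinear columns.

    n_select should not exceed the rank of X, otherwise a zero-norm
    column would be selected and the projection would divide by zero.
    """
    P = np.asarray(X, dtype=float).copy()
    selected = [start]
    for _ in range(n_select - 1):
        v = P[:, selected[-1]]
        # project all columns onto the orthogonal complement of v
        P = P - np.outer(v, v @ P) / (v @ v)
        norms = np.linalg.norm(P, axis=0)
        norms[selected] = -1.0         # never reselect a chosen column
        selected.append(int(np.argmax(norms)))
    return selected

def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef
```

In use, one would restrict `X` to the columns returned by `derivative_extrema`, run `spa` on that reduced matrix, and fit `fit_mlr` on the surviving columns. The number of wavelengths to keep and the starting column are tuning choices that the abstract does not specify.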