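Near-infrared spectroscopy (NIRS) is an indirect analytical technique whose application requires a corresponding calibration model. To improve the model's interpretability, prediction accuracy, and modeling efficiency, wavelength selection must be performed on the NIR spectra so that redundant information is minimized. Intelligent optimization algorithms take the behavior of living organisms or the motion of physical matter as their background, build an algorithmic model through mathematical abstraction, and solve combinatorial optimization problems by iterative computation. Their core strategy is to use an objective function as the criterion and, on the basis of multivariate calibration modeling, to screen out effective wavelength points by successive approximation. Five intelligent optimization algorithms, namely ant colony optimization (ACO), genetic algorithm (GA), particle swarm optimization (PSO), random frog (RF), and simulated annealing (SA), were used to select characteristic wavelengths from NIR spectral data for total nitrogen and nicotine in tobacco leaf. Combined with partial least squares (PLS), they were used to build multiple calibration models for tobacco leaf total nitrogen and nicotine. The results show that, for the two datasets used, the best total nitrogen models were PSO-PLS and GA-PLS, respectively, and the best nicotine models were GA-PLS and SA-PLS, respectively. Not all models built with the five algorithms outperformed the full-spectrum PLS model in prediction, but the PLS models built after wavelength selection were greatly simplified, and their prediction accuracy, interpretability, and stability improved. The selected wavelengths were also interpreted and analyzed: the preferred characteristic wavelength combination for tobacco leaf total nitrogen was 4587-4878 and 6700-7200 cm⁻¹, and that for tobacco leaf nicotine was 4500-4700 and 5800-6000 cm⁻¹. The selected characteristic wavelengths have real physical meaning.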
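The core strategy described above (an objective function, a multivariate calibration model, and stepwise refinement of the wavelength subset) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes simulated annealing as the search, a one-component PLS1 model as the calibration step, hold-out RMSE as the objective, and synthetic data in place of tobacco spectra. All function names and parameter values are hypothetical.

```python
import math
import random


def pls1_one_component(X, y):
    """Fit a one-component PLS1 model (assumption: one latent variable)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction of maximum covariance between X and y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w)) or 1.0
    w = [v / norm for v in w]
    # Scores and the regression coefficient of y on the scores.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(t[i] * yc[i] for i in range(n)) / (sum(v * v for v in t) or 1.0)

    def predict(rows):
        return [y_mean + q * sum((r[j] - x_mean[j]) * w[j] for j in range(p))
                for r in rows]
    return predict


def rmse_for_mask(mask, X_tr, y_tr, X_va, y_va):
    """Objective function: hold-out RMSE using only the selected wavelengths."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return float("inf")  # an empty subset is never acceptable
    tr = [[row[j] for j in cols] for row in X_tr]
    va = [[row[j] for j in cols] for row in X_va]
    pred = pls1_one_component(tr, y_tr)(va)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(pred, y_va)) / len(y_va))


def anneal_wavelengths(X_tr, y_tr, X_va, y_va, n_iter=300, t0=1.0,
                       cooling=0.98, seed=0):
    """Simulated annealing over wavelength masks, starting from the full spectrum."""
    rng = random.Random(seed)
    p = len(X_tr[0])
    current = [True] * p
    cur_err = rmse_for_mask(current, X_tr, y_tr, X_va, y_va)
    best, best_err = current[:], cur_err
    temp = t0
    for _ in range(n_iter):
        cand = current[:]
        j = rng.randrange(p)
        cand[j] = not cand[j]  # flip one wavelength in or out
        err = rmse_for_mask(cand, X_tr, y_tr, X_va, y_va)
        # Accept improvements always, worse moves with Metropolis probability.
        if err < cur_err or rng.random() < math.exp((cur_err - err) / temp):
            current, cur_err = cand, err
            if err < best_err:
                best, best_err = cand[:], err
        temp *= cooling
    return best, best_err


def make_synthetic(n=40, p=20, seed=1):
    """Synthetic stand-in data: y depends on columns 3 and 12 only."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [2.0 * row[3] - 1.5 * row[12] + rng.gauss(0, 0.1) for row in X]
    return X, y
```

Because the search starts from the full spectrum and keeps the best mask seen, the returned error can never exceed the full-spectrum error on the same hold-out set. That set is reused for selection, so an independent test set would still be needed to judge real predictive performance.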
Near-infrared spectroscopy (NIRS) is an indirect analytical technique whose application depends on establishing a suitable calibration model. To improve the interpretability, accuracy, and modeling efficiency of the prediction model, wavelength selection is important because it minimizes redundant information in the near-infrared spectrum. Intelligent optimization algorithms are a commonly used class of wavelength selection methods. They build an algorithmic model by mathematical abstraction from biological behavior or the motion of matter, then solve combinatorial optimization problems by iterative calculation. Their core strategy is to screen effective wavelength points during multivariate calibration modeling by successive approximation, using an objective function as the criterion. In this work, five intelligent optimization algorithms, namely ant colony optimization (ACO), genetic algorithm (GA), particle swarm optimization (PSO), random frog (RF), and simulated annealing (SA), were used to select characteristic wavelengths from NIR data of tobacco leaf for the determination of total nitrogen and nicotine content, and were combined with partial least squares (PLS) to construct multiple calibration models. Comparative analysis of these models showed that the optimal total nitrogen models for datasets A and B were the PSO-PLS and GA-PLS models, respectively, and the optimal nicotine models were the GA-PLS and SA-PLS models, respectively. Although not all of the optimized models predicted better than the full-spectrum PLS models, they were greatly simplified, and their prediction accuracy, precision, interpretability, and stability were improved. This research is therefore of significance for practical application. It could also be concluded that the informative wavelength combination for total nitrogen was 4587-4878 and 6700-7200 cm⁻¹, and that for tobacco nicotine was 4500-4700 and 5800-6000 cm⁻¹. These selected wave