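The successive projections algorithm (SPA) was applied to select the best combination of principal components obtained by principal component analysis (PCA). The short-wave near-infrared spectra of milk powder were first analyzed by PCA; SPA then gave the best principal component combinations for predicting fat and protein content as principal components 1, 2, 4, 5, 6 and 7 and principal components 1, 2, 3, 4, 5 and 8, respectively. Least-squares support vector machine (LS-SVM) models were used to predict the fat and protein content of milk powder, and the combinations selected by SPA outperformed the models built on the first four through the first eight principal components. With the SPA-selected combination, the coefficient of determination for prediction (R^2p), root mean square error for prediction (RMSEP) and residual predictive deviation (RPD) for fat content were 0.9890, 0.1703 and 9.5343, respectively; for protein content, R^2p, RMSEP and RPD were 0.9876, 0.1348 and 8.9274. These results indicate that SPA can quickly and effectively select the best principal components; the search is simple and fast and does not require tuning a large number of parameters.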
The successive projections algorithm (SPA) was employed to select the optimal combination of principal components (PCs) obtained by principal component analysis (PCA). Short-wave near-infrared spectra of milk powder were first analyzed by PCA, and the optimal combination among the first eight PCs was then determined by SPA. The optimal PC combination for fat content prediction was PC1, PC2, PC4, PC5, PC6 and PC7, and the combination for protein content prediction was PC1, PC2, PC3, PC4, PC5 and PC8. Least-squares support vector machine (LS-SVM) models with different PC combinations as inputs were established to predict fat and protein content. For both fat and protein, the PC combination selected by SPA gave better prediction results than the first four through the first eight PCs. With the SPA-selected combination, the coefficient of determination for prediction (R^2p), root mean square error for prediction (RMSEP) and residual predictive deviation (RPD) were 0.9890, 0.1703 and 9.5343, respectively, for fat, and 0.9876, 0.1348 and 8.9274 for protein. The overall results demonstrate that SPA can quickly and effectively select the optimal PC combination. The selection process is simple and does not require extensive parameter tuning.
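The abstract gives no implementation details, so the following is only a minimal sketch of two ingredients it names: the projection step that SPA uses to build a chain of minimally collinear candidate columns, and the root mean square error for prediction and residual predictive deviation used to score a model. The function names (`spa_chain`, `rmsep`, `r2p`, `rpd`) are hypothetical. Computing R^2p as the squared Pearson correlation and the residual predictive deviation from the sample standard deviation (`ddof=1`) are assumptions, since conventions differ between papers. The step that compares candidate chains with a regression model (LS-SVM in this work) is not shown.

```python
import numpy as np


def spa_chain(X, start, n_select):
    """Pick n_select column indices with the SPA projection step.

    X is a (samples x candidates) matrix, for example PC scores
    (assumed layout); start is the index of the first column.
    """
    P = np.asarray(X, dtype=float).copy()
    selected = [start]
    for _ in range(n_select - 1):
        ref = P[:, selected[-1]].copy()
        # Project all columns onto the subspace orthogonal to the last pick.
        P = P - np.outer(ref, ref @ P) / (ref @ ref)
        norms = np.linalg.norm(P, axis=0)
        norms[selected] = -1.0  # already chosen columns cannot win again
        selected.append(int(np.argmax(norms)))
    return selected


def rmsep(y_true, y_pred):
    """Root mean square error on the prediction set."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r2p(y_true, y_pred):
    """Squared Pearson correlation (assumed definition of R^2p)."""
    return float(np.corrcoef(y_true, y_pred)[0, 1] ** 2)


def rpd(y_true, y_pred):
    """Sample SD of the reference values divided by RMSEP (assumed ddof=1)."""
    return float(np.std(np.asarray(y_true, dtype=float), ddof=1)
                 / rmsep(y_true, y_pred))


# Toy data: column 1 is collinear with column 0, column 2 is orthogonal.
X_demo = np.array([[1.0, 2.0, 0.0],
                   [0.0, 0.0, 0.5],
                   [0.0, 0.0, 0.0]])
print(spa_chain(X_demo, start=0, n_select=2))  # -> [0, 2]
```

In the toy matrix, column 1 has the largest norm but carries no information beyond column 0, so the projection step skips it in favour of column 2. In a full SPA run, chains of different lengths and starting columns would each be scored on a validation set, and the chain with the lowest error would be kept.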