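Visible/near-infrared absorbance spectroscopy was applied to the nondestructive discrimination of shiitake mushrooms from different sources. Principal component analysis (PCA) was used to compress the spectral data in the 375-1025 nm range and to extract principal components; the first three principal components reached a cumulative credibility of 94.37%, indicating that a sample discrimination model can feasibly be built in three-dimensional space. A method combining PCA with three-dimensional space clustering was proposed, in which a genetic algorithm determines the planes that partition the sample space. The genetic algorithm takes as its fitness function the number of same-source samples for which the division plane equation changes sign, and minimizes it. A comparison model combining PCA with a BP neural network was also built. Of the 195 samples selected, 150 were used to build the models and the remaining 45 to test their predictive ability. Both models used the same calibration and prediction sets. The results show that the two models have essentially the same predictive ability, with accuracy above 91% in both cases. Compared with the BP neural network, the new method is more intuitive and simpler, and offers a new route toward instrument-based discrimination.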
The potential of visible/near-infrared absorbance spectroscopy for the nondestructive discrimination of shiitake mushrooms from different sources was evaluated. First, spectral data ranging from 375 to 1025 nm were analyzed by principal component analysis (PCA) for data compression and space clustering. The first three principal components (PCs) reached a cumulative credibility of 94.37%, which indicates that a sample discrimination model can be established in three-dimensional space. Then, a new method was proposed in which space division planes are established on the 3-D PC score plot. Because the samples are irregularly distributed, the division planes were determined by a genetic algorithm (GA). The fitness function counts the samples to which the division plane function assigns the wrong sign, and the GA seeks to minimize it. Various parameters were predetermined, including population size, selection method, crossover rate, mutation rate and iteration number. Three plane functions were obtained as the model for sample discrimination. To evaluate the prediction performance of the new model, a second model based on PCA and a 3-layer BP-ANN was created for comparison. The three PCs were used as the input of the BP-ANN. The number of neurons in the hidden layer was optimized based on the calibration error, and the output layer was binary-encoded. In the test, a total of 195 samples were examined, of which 150 were randomly selected for model building and the other 45 for model prediction. Both models used the same calibration set and prediction set. The results indicated that the two models had similar discrimination capability, each with a sample recognition rate above 91%.
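The PCA compression and the wrong-sign fitness function described above can be illustrated with a minimal sketch. It is not the authors' implementation. The following are assumptions made for illustration: the use of NumPy, PCA via singular value decomposition, group labels coded as +1/-1 for the two sides of a plane, and a simple elitist GA with uniform crossover and Gaussian mutation. The function names and all default parameter values are hypothetical.

```python
import numpy as np

def pca_scores(X, n_components=3):
    """Return PC scores and the cumulative explained-variance ratio."""
    Xc = X - X.mean(axis=0)                       # mean-center each wavelength
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    var_ratio = S**2 / np.sum(S**2)               # variance share of each PC
    scores = Xc @ Vt[:n_components].T             # project onto the first PCs
    return scores, np.cumsum(var_ratio)[:n_components]

def wrong_sign_count(plane, scores, labels):
    """Fitness: samples on the wrong side of a*x + b*y + c*z + d = 0.

    labels are +1 / -1 (assumed coding); a sample lying exactly on the
    plane has sign 0 and is counted as wrong.
    """
    side = np.sign(scores @ plane[:3] + plane[3])
    return int(np.sum(side != labels))

def fit_plane_ga(scores, labels, pop_size=60, generations=200,
                 crossover_rate=0.8, mutation_rate=0.2, seed=0):
    """Search for plane coefficients (a, b, c, d) that minimize the fitness."""
    rng = np.random.default_rng(seed)
    pop = rng.normal(size=(pop_size, 4))          # random initial planes
    for _ in range(generations):
        fit = np.array([wrong_sign_count(p, scores, labels) for p in pop])
        pop = pop[np.argsort(fit)]                # best individuals first
        if fit.min() == 0:                        # perfect separation found
            break
        elite = pop[: pop_size // 2]              # keep the better half
        children = []
        while len(children) < pop_size - len(elite):
            pa, pb = elite[rng.integers(len(elite), size=2)]
            child = pa.copy()
            if rng.random() < crossover_rate:     # uniform crossover
                mask = rng.random(4) < 0.5
                child[mask] = pb[mask]
            if rng.random() < mutation_rate:      # Gaussian mutation
                child += rng.normal(scale=0.3, size=4)
            children.append(child)
        pop = np.vstack([elite, children])
    fit = np.array([wrong_sign_count(p, scores, labels) for p in pop])
    best = int(np.argmin(fit))
    return pop[best], int(fit[best])
```

For a multi-group problem such as the one described, one such plane would be fitted per partition, with the labels recoded each time to mark which samples belong on the positive side.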
It can be concluded that while BP-ANN tends to solve high-dimension data analysis, the ne