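Near-infrared (NIR) spectroscopy, as a fast and non-destructive detection technique, is well suited to on-site discrimination of genuine and counterfeit drugs. Autoencoder networks have attracted wide attention as a current research focus in machine learning; the autoencoder network is a typical deep learning model with stronger representational capacity than traditional shallow learning methods. Autoencoder networks use a greedy layer-wise pre-training algorithm that trains each layer in turn by minimizing that layer's reconstruction error, and thereby trains the whole network. Whitening the data and reconstructing the input layer by layer with an unsupervised algorithm lets the network learn the internal structural features of the data more effectively. Labeled data are then used to fine-tune the whole network with a supervised learning algorithm. First, the NIR spectral data of genuine and counterfeit erythromycin ethylsuccinate tablets were preprocessed and whitened; whitening reduces the correlation between features and gives all features the same variance. After this processing, a sparse denoising autoencoder network was used to build a classification model for the genuine and counterfeit drug spectra, and the model was compared with a BP neural network and the SVM algorithm in terms of classification accuracy and algorithm stability. The results show that whitening the spectral data effectively improves the classification accuracy of the sparse denoising autoencoder network. The classification accuracy of the autoencoder network was higher than that of the BP neural network at every training-set size. The SVM algorithm had the advantage with few training samples, but once the training set reached a certain size the classification accuracy of the autoencoder network exceeded that of the SVM algorithm. In terms of stability, the autoencoder network was also more stable than the BP neural network and the SVM algorithm. Modeling the NIR spectral data of genuine and counterfeit drugs with a sparse denoising autoencoder network allows genuine and counterfeit drugs to be discriminated effectively.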
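The whitening step described above (decorrelating the features and giving each of them the same variance) can be illustrated with a small sketch. The abstract does not say which whitening transform was used, so Cholesky whitening is assumed here purely for illustration; PCA or ZCA whitening would meet the same two goals. The function names (`mean_center`, `covariance`, `cholesky`, `whiten`), the regularization constant `eps` and the toy data are all hypothetical, and the code uses only the Python standard library so that it runs without extra dependencies.

```python
import math
import random


def mean_center(rows):
    """Subtract the per-feature mean from every sample."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(rows):
    """Sample covariance matrix of rows that are already mean-centered."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]


def cholesky(a, eps=1e-8):
    """Lower-triangular L such that L times its transpose equals a + eps*I."""
    d = len(a)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                # eps (an assumed small constant) guards against a
                # numerically singular covariance matrix.
                low[i][j] = math.sqrt(a[i][i] + eps - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def whiten(rows, eps=1e-8):
    """Cholesky whitening: solve L z = x for every centered sample x."""
    centered = mean_center(rows)
    low = cholesky(covariance(centered), eps)
    whitened = []
    for x in centered:
        z = []
        for i in range(len(x)):
            # Forward substitution on the lower-triangular system.
            s = sum(low[i][k] * z[k] for k in range(i))
            z.append((x[i] - s) / low[i][i])
        whitened.append(z)
    return whitened


if __name__ == "__main__":
    # Toy data standing in for spectra: feature 1 is strongly correlated
    # with feature 0, and feature 2 has a much larger variance.
    random.seed(0)
    toy = []
    for _ in range(200):
        a = random.gauss(0, 1)
        toy.append([a, 0.8 * a + 0.2 * random.gauss(0, 1),
                    5.0 * random.gauss(0, 1)])
    for row in covariance(whiten(toy)):
        print([round(v, 3) for v in row])
```

After whitening, the sample covariance of the data is approximately the identity matrix, which is the property the abstract attributes to the whitening step. For real spectra with hundreds of wavelength channels, a linear-algebra library would be the more practical choice.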
As a fast and non-destructive testing technology, near-infrared (NIR) spectroscopy is very suitable for pharmaceutical discrimination. The autoencoder network, a hot research topic, has drawn widespread attention in machine learning research in recent years. As a typical deep network model, the autoencoder network has more powerful modeling capability than traditional shallow learning models. Based on unsupervised greedy layer-wise pre-training, the autoencoder trains the network layer by layer while minimizing the reconstruction error. Each layer is pre-trained with an unsupervised learning algorithm and learns a nonlinear transformation of its input, which is the output of the previous layer. Pre-whitening helps the network capture the inner structural features of the data more effectively. The unsupervised pre-training is followed by supervised fine-tuning: the pre-training sets the stage for a final training phase in which the deep architecture is fine-tuned with respect to a supervised training criterion using gradient-based optimization. In this paper, a preprocessing step and a pre-whitening transformation were first applied to near-infrared spectroscopy data of erythromycin ethylsuccinate. The pre-whitening transformation reduced the correlation between features and gave each feature the same variance. Experimental results showed that pre-whitening effectively improved the classification accuracy of the Sparse Denoising Autoencoder (SDAE). An SDAE with two hidden layers, combined with pre-whitening, was used to build the classification model for identifying counterfeit pharmaceuticals. The SDAE was compared with BP neural networks and the SVM algorithm in terms of classification accuracy and mean absolute difference (MAD). The SDAE achieved higher classification accuracy than a BP neural network with the same network structure, and it also outperformed the SVM algorithm once the training dataset reached a certain size. As to the gene