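Using a terahertz time-domain spectroscopy system, we measured the transmission spectra of hemagglutinin protein at seven concentrations, together with those of its reactions with a specific antibody and with an irrelevant-antibody control group. Spectral pretreatment and principal component analysis were applied to several terahertz spectral parameters. The results show that principal component analysis, while reducing the dimensionality of the data, also highlights their main trends of variation. Provided that the correlations among the original variables are consistent, the reduced absorption cross-section shows the strongest correlation with hemagglutinin concentration, whereas the dielectric loss tangent is better suited to qualitative analysis of the clustering of hemagglutinin-antibody complexes. This study indicates that principal component analysis offers important guidance for analyzing terahertz biological spectra and for further studying protein structure and function.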
It has been shown that the collective structural vibrational modes of proteins lie in the terahertz (THz) frequency range. These modes are associated with the polypeptide backbone and are thought to be essential for the conformational dynamics required for protein function. Hemagglutinin (HA) is the main surface glycoprotein of the influenza A virus. The H9N2 subtype of influenza A virus is regarded as the most likely pandemic strain because it crosses the species barrier and infects swine and humans. In this paper we use principal component analysis (PCA) to study the concentration-dependent terahertz spectra of HA protein at seven concentrations, and to detect the binding interaction of HA with the broadly neutralizing monoclonal antibody F10 in the liquid phase. Spectral pretreatment and band selection play a vital role in THz spectroscopic analysis because the original spectrum contains a large amount of interfering information. To compress the variables and extract useful information, we apply a variety of pretreatment methods before PCA, including the second derivative, multiplicative scatter correction (MSC), least-squares polynomial fitting derivatives, standard normalization, smoothing, and moving-window median filtering. We ultimately adopt MSC + smoothing + Savitzky-Golay (SG) second derivative + median filtering as the optimized pretreatment method. THz spectral parameters, including the refractive index, absorption coefficient, reduced absorption cross-section, and dielectric loss tangent, are calculated in the frequency range 0.1-1.4 THz for comparison. The results indicate that the reduced absorption cross-section shows the highest correlation with the concentration of HA protein, while the dielectric loss tangent appears to be more appropriate for qualitative analysis of the HA-antibody binding interaction.
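The pretreatment chain and PCA step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the window length, polynomial order, median kernel size, the synthetic spectra, and all function names (`msc`, `pretreat`, `pca`, `synthetic_spectra`) are assumptions introduced here, and the synthetic data merely stand in for measured THz parameter curves.

```python
import numpy as np
from scipy.signal import savgol_filter
from scipy.ndimage import median_filter


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum."""
    reference = spectra.mean(axis=0)
    corrected = np.empty_like(spectra, dtype=float)
    for i, row in enumerate(spectra):
        # Fit row ~ slope * reference + intercept, then undo both effects.
        slope, intercept = np.polyfit(reference, row, deg=1)
        corrected[i] = (row - intercept) / slope
    return corrected


def pretreat(spectra, window=11, polyorder=3, median_size=5):
    """MSC + SG smoothing + SG second derivative + median filtering.

    The parameter values are illustrative assumptions, not the paper's.
    """
    x = msc(spectra)
    x = savgol_filter(x, window, polyorder, axis=1)           # smoothing
    x = savgol_filter(x, window, polyorder, deriv=2, axis=1)  # 2nd derivative
    # Median filter along the frequency axis only (size 1 across samples).
    return median_filter(x, size=(1, median_size), mode="nearest")


def pca(x, n_components=2):
    """PCA via SVD of the column-centered data matrix (rows = samples)."""
    centered = x - x.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = s**2 / np.sum(s**2)
    return scores, vt[:n_components], explained[:n_components]


def synthetic_spectra(concentrations, freq, noise=1e-3, seed=0):
    """Hypothetical spectra: smooth baseline plus a concentration-scaled band."""
    rng = np.random.default_rng(seed)
    baseline = 1.0 + 2.0 * freq**2
    band = np.exp(-((freq - 0.8) / 0.1) ** 2)
    clean = baseline + 0.2 * np.outer(concentrations, band)
    return clean + rng.normal(scale=noise, size=clean.shape)


freq = np.linspace(0.1, 1.4, 131)          # THz, matching the stated range
concentrations = np.linspace(0.5, 3.5, 7)  # seven levels, arbitrary units
treated = pretreat(synthetic_spectra(concentrations, freq))
scores, loadings, explained = pca(treated)
r = np.corrcoef(scores[:, 0], concentrations)[0, 1]
print("explained variance ratio:", explained)
print("|corr(PC1, concentration)|:", abs(r))
```

Correlating the first principal component score with concentration is one simple way to compare how strongly different spectral parameters respond to concentration. The sign of a principal component is arbitrary, so the absolute value of the correlation is the meaningful quantity.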
The PCA method provides a feasible and effective way to identify sensitive parameters for further analysis of protein function and the antigen-antibody interaction us