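This paper proposes a new method for screening biomarkers: the classified characteristic variable (CCV) method. Built on the principle of partial least squares (PLS), this statistical method incorporates information from both the discriminant function and the classification latent variables, which gives it an advantage in biomarker screening. The paper describes the principle and calculation of the CCV method and details its application to a real metabolomics data set. The method was applied to serum metabolic fingerprint data from nasopharyngeal carcinoma patients and healthy controls, acquired by gas chromatography-mass spectrometry (GC-MS), to screen for potential biomarkers. Nineteen variables were obtained, corresponding to 13 endogenous metabolites, and their discriminating ability was compared with that of the metabolic markers selected by the loadings plot method. Using the characteristic variables selected by each method as input, partial least squares-linear discriminant analysis (PLS-LDA) and cross validation (CV) were used to assess classification ability and predictive ability, respectively. The results show that CCV clearly outperforms the commonly used loadings plot method and is a new, fast, and effective method for biomarker screening.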
Because it draws on both the discriminant function and the latent variables, the classified characteristic variable (CCV) method is well suited to screening potential biomarkers. In this paper, the principle and calculation of the method are elucidated. Gas chromatography-mass spectrometry (GC-MS) was applied to analyze the serum profiles of nasopharyngeal carcinoma patients and healthy controls. Potential biomarkers were then screened using the CCV method. Its performance was evaluated using cross validation (CV) and PLS-LDA. The study showed that the correct classification rate obtained with the CCV method was superior to that obtained with the loadings plot method.