With the development of large-scale gene expression profiling technology, diagnosis based on gene expression profiles is becoming a fast and effective method in clinical medicine. However, because gene expression data are high-dimensional, noisy, and drawn from small samples, correctly extracting the feature genes associated with cancer becomes the key problem. Taking colon cancer gene expression profile data as an example, this paper proposes a hybrid feature gene extraction method that combines the Fisher weight function, the discrete Fourier transform, and principal component analysis, and uses multiple logistic regression analysis and Bayesian decision as classifiers for tumor classification and detection. Experimental results show that the method reaches a cross-validation (CV) recognition accuracy of 96.80% on the colon cancer data set.
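As an illustration of the first stage of such a pipeline, the sketch below ranks genes by a two-class Fisher criterion, (mu0 - mu1)^2 / (var0 + var1). The abstract does not give the exact form of the Fisher weight function, so this formula, the function names `fisher_score` and `top_genes`, and the toy data are all assumptions made for illustration, not the authors' implementation. The later stages (discrete Fourier transform, principal component analysis, and the classifiers) are not shown.

```python
from statistics import mean, pvariance


def fisher_score(values, labels):
    """Two-class Fisher criterion for one gene (assumed form, not from the paper)."""
    g0 = [v for v, y in zip(values, labels) if y == 0]
    g1 = [v for v, y in zip(values, labels) if y == 1]
    denom = pvariance(g0) + pvariance(g1)
    if denom == 0:
        # Assumption: a gene with no within-class spread is scored as 0
        # to avoid division by zero in this toy sketch.
        return 0.0
    return (mean(g0) - mean(g1)) ** 2 / denom


def top_genes(expr, labels, k):
    """Return indices of the k highest-scoring genes.

    expr is a list of gene rows; each row holds one value per sample.
    """
    scores = [fisher_score(row, labels) for row in expr]
    order = sorted(range(len(expr)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Hypothetical toy data: 3 genes, 6 samples (3 per class).
labels = [0, 0, 0, 1, 1, 1]
expr = [
    [1.0, 1.2, 0.8, 1.1, 0.9, 1.0],  # gene 0: classes overlap
    [0.1, 0.2, 0.3, 2.1, 2.2, 2.3],  # gene 1: classes well separated
    [1.0, 2.0, 3.0, 2.0, 3.0, 4.0],  # gene 2: weak separation
]

print(top_genes(expr, labels, 2))
```

On this toy input the well-separated gene is ranked first and the weakly separated gene second. In a real setting the retained genes would then be passed to the transform and dimensionality-reduction steps named in the abstract.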