Massive data sets are a hallmark of the modern information society, and effectively selecting discriminative features for sample classification from among thousands of genes is important for the diagnosis and treatment of cancer. This paper applies local non-negative matrix factorization (LNMF) to extract features from cancer gene expression profile data. The gene expression data are first screened; a local non-negative matrix is then constructed and factorized to obtain low-dimensional feature vectors that adequately characterize the samples; finally, the feature vectors are classified with a support vector machine (SVM). Experimental results indicate that the method is feasible and effective.
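The screening and factorization steps described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and the abstract does not give the screening criterion or the exact update rules. Two assumptions are made here: genes are ranked by a two-class signal-to-noise score, and the factorization uses the commonly cited multiplicative LNMF updates (square-root update for the coefficients, column-normalized basis). The function names `screen_genes` and `lnmf` and all parameter names are hypothetical.

```python
import numpy as np


def screen_genes(X, y, n_keep):
    """Return sorted indices of the n_keep genes with the largest score.

    X is a non-negative (n_genes, n_samples) expression matrix and y holds
    0/1 class labels. The signal-to-noise score used here is an assumption;
    the abstract does not name the screening criterion.
    """
    a, b = X[:, y == 0], X[:, y == 1]
    score = np.abs(a.mean(axis=1) - b.mean(axis=1)) / (
        a.std(axis=1) + b.std(axis=1) + 1e-12
    )
    return np.sort(np.argsort(score)[::-1][:n_keep])


def lnmf(X, rank, n_iter=200, seed=0, eps=1e-12):
    """Factorize X ~ B @ H with non-negative B (basis) and H (coefficients).

    Assumed update rules (one common form of LNMF, not confirmed by the
    abstract): a square-root multiplicative update for H, a multiplicative
    update for B, then normalizing each column of B to sum to 1.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    B = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        R = X / (B @ H + eps)                      # element-wise ratio X / (BH)
        H = np.sqrt(H * (B.T @ R))                 # coefficient update
        R = X / (B @ H + eps)
        B = B * (R @ H.T) / (H.sum(axis=1) + eps)  # basis update
        B = B / (B.sum(axis=0) + eps)              # columns of B sum to 1
    return B, H
```

Each column of `H` is then the low-dimensional feature vector of one sample. In the pipeline the abstract describes, these vectors would be passed to a support vector machine for classification; that step is omitted here to keep the sketch dependent on NumPy alone.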