With the development of gene microarray technology, gene expression profiling has become an important method for identifying different types of cancer. Microarray gene expression data generally come from clinical trials, where the class distribution of the samples is uncertain, so the expression data are often markedly imbalanced. In this paper, the weighted extreme learning machine (WELM) is used to classify imbalanced microarray gene expression data. To reduce the classification error caused by the imbalance, a weight is assigned to each sample to strengthen the influence of the minority class and weaken that of the majority class, thereby improving the accuracy of tumor classification. The experimental results show that the proposed method improves the recognition rate of the minority class and, in turn, the overall performance of the classifier.
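The per-sample weighting idea can be illustrated with a minimal sketch. The abstract does not state the weighting scheme or the solver, so everything below is an assumption for illustration: weights are taken as the inverse of each sample's class size, the hidden layer is a random sigmoid layer, and the output weights come from a regularized weighted least-squares solve. The names `inverse_frequency_weights`, `train_welm`, `predict_welm`, `n_hidden`, and `C` are hypothetical and not taken from the paper.

```python
import numpy as np

def inverse_frequency_weights(y):
    """Assumed scheme: each sample gets weight 1 / (size of its class)."""
    y = np.asarray(y).ravel()
    _, idx, counts = np.unique(y, return_inverse=True, return_counts=True)
    return 1.0 / counts[idx]

def hidden_layer(X, A, b):
    """Random sigmoid feature map; A and b are never trained."""
    return 1.0 / (1.0 + np.exp(-(X @ A + b)))

def train_welm(X, y, n_hidden=50, C=1.0, seed=0):
    """Fit output weights by weighted, ridge-regularized least squares."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y).ravel()
    rng = np.random.default_rng(seed)
    classes, idx = np.unique(y, return_inverse=True)

    # Random input weights and biases of the hidden layer.
    A = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = hidden_layer(X, A, b)

    # One-vs-all targets: +1 for the true class, -1 elsewhere.
    T = -np.ones((X.shape[0], classes.size))
    T[np.arange(X.shape[0]), idx] = 1.0

    # H^T W with W = diag(w): scale each sample's column by its weight.
    w = inverse_frequency_weights(y)
    HtW = H.T * w

    # Solve (I / C + H^T W H) beta = H^T W T.
    beta = np.linalg.solve(np.eye(n_hidden) / C + HtW @ H, HtW @ T)
    return {"A": A, "b": b, "beta": beta, "classes": classes}

def predict_welm(model, X):
    """Predict the class with the largest output score."""
    H = hidden_layer(np.asarray(X, dtype=float), model["A"], model["b"])
    return model["classes"][np.argmax(H @ model["beta"], axis=1)]
```

With this assumed scheme the weights within each class sum to 1, so a class with 3 samples contributes as much to the training objective as a class with 90. In practice `n_hidden` and `C` would need tuning, for example by cross-validation on a metric that is sensitive to the minority class.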