

Feature Selection for SELDI-TOF Protein Mass Spectra Using Affinity Propagation Clustering

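ISSN: 0258-8021
Journal: Chinese Journal of Biomedical Engineering (《中国生物医学工程学报》)
Date: 0
Classification: R318 [Medicine and Health: Biomedical Engineering; Medicine and Health: Basic Medicine]
Author affiliations: [1] College of Life Information Science and Instrument Engineering, Hangzhou Dianzi University, Hangzhou 310018; [2] Zhejiang Cancer Research Institute, Hangzhou 310022
Funding: National Natural Science Foundation of China (60801054, 60801055); National Science Fund for Distinguished Young Scholars (60788101)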

Authors: Yang Helong (杨合龙)[1], Zhu Lei (祝磊)[1], Han Bin (韩斌)[1], Li Lihua (厉力华)[1], Zheng Zhiguo (郑智国)[2], Meng Xuli (孟旭莉)[2]

Keywords: protein mass spectrometry, affinity propagation clustering, feature selection, biomarker

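Abstract (translated from the Chinese):

To analyze high-throughput SELDI-TOF mass spectrometry data effectively and to screen for tumor-related protein peaks, a feature selection method based on affinity propagation clustering is proposed. The SELDI data are first pre-screened with a t-test. Affinity propagation clustering and null-space LDA are then applied to reduce dimensionality and remove correlation. Finally, SVM-RFE is used for feature selection to identify the protein peaks relevant to tumor discrimination. Four classifiers (SVM, KNN, NB, and J4.8) are used to estimate the classification performance of the algorithm. The algorithm was evaluated on the public ovarian cancer datasets OC-WCX2a and OC-WCX2b and on the breast cancer dataset BC-WCX2a from Zhejiang Cancer Hospital. On these three datasets, classification accuracy reached 96.43%, 99.66%, and 90.88%; sensitivity reached 97.00%, 100%, and 96.17%; and specificity reached 95.85%, 99.08%, and 81.92%, respectively. Ten protein peaks relevant to tumor discrimination were selected for each dataset. The proposed algorithm achieves good classification accuracy and effectively extracts discriminative protein mass-spectrum peaks, which may assist cancer diagnosis.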
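The clustering step named in the abstract can be illustrated with a minimal sketch of the standard affinity propagation message-passing updates, written with the Python standard library only. It is not the authors' implementation. The abstract does not state the similarity measure, the preference value, the damping factor, or the iteration count, so the negative squared distance, the median preference, `damping=0.5`, and `n_iter=200` below are all assumptions, and the six toy points are hypothetical stand-ins for spectral features.

```python
from statistics import median


def affinity_propagation(S, damping=0.5, n_iter=200):
    """Return (exemplars, labels) for a square similarity matrix S.

    S[k][k] holds the "preference" of item k to become an exemplar.
    damping and n_iter are assumed defaults, not values from the paper.
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(n_iter):
        # r(i,k) = s(i,k) - max over k' != k of [a(i,k') + s(i,k')]
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            best = max(range(n), key=scores.__getitem__)
            runner_up = max(v for k, v in enumerate(scores) if k != best)
            for k in range(n):
                new = S[i][k] - (runner_up if k == best else scores[best])
                R[i][k] = damping * R[i][k] + (1 - damping) * new
        # a(i,k) = min(0, r(k,k) + sum of positive r(i',k), i' not in {i,k})
        # a(k,k) = sum of positive r(i',k) over i' != k
        for k in range(n):
            support = [max(0.0, R[i][k]) for i in range(n)]
            support[k] = R[k][k]  # the self-responsibility is not clipped
            total = sum(support)
            for i in range(n):
                new = total - support[i]
                if i != k:
                    new = min(0.0, new)
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    # Item k is an exemplar when r(k,k) + a(k,k) is positive.
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:
        return [], [-1] * n
    # Exemplars label themselves; other items join the most similar exemplar.
    labels = [i if i in exemplars else max(exemplars, key=lambda e: S[i][e])
              for i in range(n)]
    return exemplars, labels


# Hypothetical toy data: two well-separated groups of one-dimensional values.
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
n = len(points)
S = [[-(a - b) ** 2 for b in points] for a in points]  # assumed similarity
pref = median(S[i][j] for i in range(n) for j in range(n) if i != j)
for k in range(n):
    S[k][k] = pref  # assumed preference: median of the off-diagonal similarities

exemplars, labels = affinity_propagation(S)
print("exemplars:", exemplars)
print("labels:", labels)
```

In a feature selection setting, one plausible use (again an assumption) is to cluster correlated features and keep only the exemplar of each cluster, which reduces both dimensionality and redundancy before a later selection step such as SVM-RFE.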

English abstract:

To analyze high-throughput, high-resolution mass spectrometry data effectively and to capture cancer-related protein features from them, this paper proposes a feature selection method based on affinity propagation clustering. First, a t-test is applied to the mass spectrometry data as an initial screen. Next, affinity propagation clustering and NS-LDA are used to reduce dimensionality and correlation. Third, SVM-RFE is used to select the features. Finally, four classifiers are used to estimate the performance of the algorithm. The method was tested and evaluated on the ovarian cancer datasets OC-WCX2a and OC-WCX2b and on the breast cancer dataset BC-WCX2a. Classification accuracy reached 96.43%, 99.66%, and 90.88%; sensitivity reached 97.00%, 100%, and 96.17%; and specificity reached 95.85%, 99.08%, and 81.92%, respectively. Ten m/z features were selected for each dataset. The experimental results show that the method performs well, and it is expected to be useful in cancer diagnosis.
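For readers unfamiliar with the three reported measures, the sketch below shows how accuracy, sensitivity, and specificity are conventionally computed from binary predictions. The function name `classification_metrics` and the label vectors are hypothetical. The abstract does not give sample counts or the validation protocol, so the toy numbers here are illustrative only and do not reproduce the reported results.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    tn = sum(t != positive and p != positive for t, p in pairs)  # true negatives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of cancer samples detected
        "specificity": tn / (tn + fp),  # share of control samples cleared
    }


# Hypothetical labels: 1 = cancer, 0 = control.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy matters when the classes are imbalanced, which may explain why the BC-WCX2a result pairs a high sensitivity (96.17%) with a lower specificity (81.92%).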
