Mass spectrometry-based proteomics analysis is a powerful approach for identifying novel biomarker patterns in biological samples. Although the data produced by mass spectrometers carry a large amount of potentially useful information, they are easily contaminated by systematic errors and noise arising from sample preparation and the instrument itself. Preprocessing of proteomic mass spectrometry data aims to suppress noise, reduce the volume of data, and make spectra comparable, and it is a crucial step in bringing out the biologically relevant information. Relying solely on the software supplied with the mass spectrometer for preprocessing has limitations, so additional tools are needed. This paper introduces typical preprocessing techniques, covering data reduction, spectrum smoothing, baseline correction, normalization, peak detection and quantification, and peak alignment. It then discusses open problems with current preprocessing methods and outlines likely future trends.
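To make the order of the preprocessing steps concrete, the following is a minimal sketch of four of them (smoothing, baseline correction, normalization, and peak detection) chained on a single spectrum. It is an illustration under stated assumptions, not the method of any particular tool: all function names and parameters are hypothetical, and each step uses a deliberately simple stand-in (a moving average, a rolling-minimum baseline, total-ion-current scaling, and local-maximum picking). Data reduction and peak alignment across spectra are left out.

```python
from typing import List, Sequence, Tuple


def moving_average(y: Sequence[float], half_window: int = 2) -> List[float]:
    """Smooth with a centered moving average; the window shrinks at the edges."""
    n = len(y)
    out = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        out.append(sum(y[lo:hi]) / (hi - lo))
    return out


def subtract_baseline(y: Sequence[float], half_window: int = 10) -> List[float]:
    """Estimate the baseline as a rolling minimum and subtract it.

    Assumption: the baseline varies slowly relative to the peak width,
    so half_window should be wider than any peak of interest.
    """
    n = len(y)
    out = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        out.append(y[i] - min(y[lo:hi]))
    return out


def normalize_tic(y: Sequence[float]) -> List[float]:
    """Scale intensities to sum to 1 (total ion current normalization)."""
    total = sum(y)
    if total <= 0:
        return list(y)  # nothing to scale; return an unchanged copy
    return [v / total for v in y]


def detect_peaks(
    mz: Sequence[float], y: Sequence[float], min_height: float
) -> List[Tuple[float, float]]:
    """Return (m/z, intensity) pairs for local maxima at or above min_height."""
    peaks = []
    for i in range(1, len(y) - 1):
        if y[i] >= min_height and y[i] > y[i - 1] and y[i] >= y[i + 1]:
            peaks.append((mz[i], y[i]))
    return peaks


def preprocess(
    mz: Sequence[float], intensity: Sequence[float], min_height: float = 0.01
) -> List[Tuple[float, float]]:
    """Chain the steps: smooth, remove baseline, normalize, pick peaks."""
    smoothed = moving_average(intensity)
    corrected = subtract_baseline(smoothed)
    normalized = normalize_tic(corrected)
    return detect_peaks(mz, normalized, min_height)
```

Calling `preprocess(mz, intensity)` on two equal-length sequences returns a list of `(m/z, normalized intensity)` pairs. Normalization is applied before peak picking so that `min_height` is expressed as a fraction of the total signal and can be compared across spectra. In practice each stand-in would likely be replaced by a more robust method.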