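Feature extraction is the process of decomposing, reorganizing, and selecting the components of spectral measurement data. It is a key step in spectral data mining: it determines the quality, efficiency, system complexity, and robustness of subsequent processing, and it also affects what knowledge can be mined and how interpretable the physical meaning of the results is. Based on how features are expressed, existing methods are divided into three categories: statistical reduction methods, characteristic spectrum methods, and spectral line methods. Their basic principles, applicability, advantages and disadvantages, and applications in spectral data mining are reviewed and analyzed. In addition, the characteristics of the different methods are discussed in terms of their "time" and "frequency" analysis capabilities, for example, the ease of interpreting physical meaning and the sensitivity to wavelength calibration distortion and flux calibration distortion.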
Feature extraction is a fundamental step in spectrum data mining. It determines the quality of the mining results as well as the efficiency, robustness, and complexity of the mining system. This work reviews the current state of feature extraction methods for celestial spectra, introduces their fundamental ideas, and analyzes their strengths, limitations, and applicability. In feature extraction, the measurements of a spectrum are decomposed, reorganized, and selected. Based on the characteristics of information expression, we classify the available feature extraction methods into three categories: statistical reduction methods, characteristic spectrum methods, and spectral line methods. Their applications in spectrum data mining are also introduced. For clarity, the statistical reduction methods are further divided into four classes: principal component analysis (PCA), wavelet transform (WT), manifold learning, and supervised methods. In addition, we examine further characteristics of these methods, such as time-frequency analysis capability, interpretability of physical meaning, robustness to calibration distortion, and robustness to outliers.