Feature selection for functional data chooses, from the vast amount of functional information, a small number of features that are weakly correlated with each other and strongly representative, so as to simplify the computation of subsequent classifiers and improve generalization ability. Since existing feature selection methods perform poorly when applied directly to functional data classification, a fast feature selection (FFS) method for functional data that combines principal component analysis (PCA) and the minimum convex hull is proposed, which can quickly obtain a stable feature subset. Furthermore, since correlations may remain among the selected features, the FFS result is used as the initial feature subset of other methods, and FFS is therefore combined with the conditional mutual information method. Experiments on the UCR datasets demonstrate the effectiveness of FFS, and comparative experiments yield a strategy for choosing among the methods under different requirements on time cost and classification accuracy.
Feature selection for functional data aims to choose, from the vast amount of functional information, a small number of features that are weakly correlated and strongly representative, thereby simplifying subsequent computation and improving generalization ability. When traditional feature selection methods are applied directly to functional data, the results are neither effective nor efficient. A functional-data-oriented fast feature selection (FFS) method integrating principal component analysis (PCA) and the minimum convex hull is proposed in this paper. FFS can obtain a stable feature subset rapidly. Considering the correlation embedded in the features, the result of FFS can serve as the initial feature subset of other iterative approaches, which means feature selection is performed twice. Conditional mutual information (CMI), a popular feature selection criterion for functional data, is adopted for this second stage. Experimental results on the UCR datasets demonstrate the effectiveness of FFS, and a strategy for selecting among the methods under different requirements on time cost and classification accuracy is given through comparative experiments.
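The sketch below illustrates one plausible reading of the FFS pipeline described above, under the assumption that FFS (1) runs PCA on the functional observations, (2) treats each original sampling point (feature) as a point in the space of its loadings on the leading principal components, and (3) keeps the features lying on the vertices of the minimum convex hull of those loading points. The exact construction in the paper may differ; the function name `fast_feature_selection` and the toy data are illustrative, not the authors' implementation.

```python
# Minimal sketch of the assumed FFS idea: PCA loadings + minimum convex hull.
import numpy as np
from sklearn.decomposition import PCA
from scipy.spatial import ConvexHull


def fast_feature_selection(X, n_components=2):
    """Select representative sampling points from functional data X (n_samples x n_points)."""
    pca = PCA(n_components=n_components)
    pca.fit(X)
    # components_ has shape (n_components, n_points); each column is the loading
    # vector of one original sampling point on the leading principal components.
    loadings = pca.components_.T            # (n_points, n_components)
    hull = ConvexHull(loadings)             # minimum convex hull of the loading points
    return np.sort(hull.vertices)           # indices of the selected sampling points


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.linspace(0, 1, 100)
    # Toy functional data: noisy sinusoids observed at 100 time points.
    X = np.sin(2 * np.pi * t) + 0.1 * rng.standard_normal((50, t.size))
    selected = fast_feature_selection(X)
    print("FFS-selected time points:", selected)
    # Per the abstract, a second stage would refine `selected` with a greedy
    # conditional-mutual-information criterion, using it as the initial subset.
```

In this reading, the hull vertices are the "extreme" and hence most representative sampling points, which is consistent with the abstract's goal of a small, weakly correlated subset; the CMI refinement stage is only indicated by the final comment.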