To obtain a feature subset with low misclassification cost, we define a cost distance between samples and incorporate it into an existing feature selection framework. By combining manifold learning with cost-sensitive feature selection, we develop a new method, cost-sensitive feature selection via manifold learning (CFSM). Most previous cost-sensitive feature selection algorithms rank features one at a time and select them using only the correlation between the features and the misclassification cost. CFSM instead uses both this correlation and the discriminative information implied within the data, which improves feature selection performance. Experimental results on six real-world datasets show that CFSM outperforms existing related algorithms.