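Feature selection plays an important role in many fields. This paper combines the rough set method with the ant colony optimization algorithm and proposes a feature selection algorithm based on rough set ant colony optimization. The algorithm uses attribute dependency and attribute significance as heuristic factors in the transition rule, and builds its pheromone update strategy from the rough set classification quality and the length of the feature subset. Tests on data sets show that the proposed method is feasible.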
Feature selection has become a focus of research in data mining, machine learning, pattern recognition, and related fields. It describes the original feature set with a more stable subset of features that retains appropriate precision. Research on feature selection has concentrated on two aspects: the search strategy for candidate subsets, and the performance evaluation of a feature subset. Developing more effective algorithms that obtain better feature subsets at lower time complexity therefore remains a central problem in the study of feature selection. Based on an analysis of the advantages and shortcomings of existing algorithms, this paper proposes a new feature selection method that combines the rough set method with the ant colony optimization algorithm. To improve performance, the algorithm takes the core attributes as the starting point of feature selection. In the transition rule and the pheromone update strategy, it uses rough set dependency and attribute significance to guide the ants' search. In addition, the quality of classification under the rough set method and the length of the feature subset are used to measure how good a feature subset is. The proposed method is tested on data sets with a certain number of records and attributes, and is compared with a feature selection method based on rough sets alone and with one based on ant colony optimization alone. The tests and comparisons show that the proposed method is feasible and that it has clear advantages in feature subset length and accuracy when the data set has core attributes. Finally, a worked example and tests on real data sets show that the proposed method is effective.
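The rough set quantities named above can be illustrated with a minimal Python sketch. It assumes the standard textbook definitions: the dependency degree γ_B(D) = |POS_B(D)| / |U|, the significance of an attribute as the gain in dependency when it is added to a subset, and the core as the set of attributes whose removal lowers the dependency of the full attribute set. The function names (`partition`, `dependency`, `significance`, `core`) and the toy decision table are hypothetical and chosen only for illustration; the sketch does not reproduce the paper's transition rule or pheromone update.

```python
from collections import defaultdict

def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())

def dependency(rows, cond_attrs, decision):
    """Dependency degree: fraction of rows in the positive region POS_B(D)."""
    if not rows:
        return 0.0
    pos = 0
    for block in partition(rows, cond_attrs):
        # A block is in the positive region if all its rows share one decision.
        if len({rows[i][decision] for i in block}) == 1:
            pos += len(block)
    return pos / len(rows)

def significance(rows, subset, attr, decision):
    """Gain in dependency when attr is added to subset."""
    return (dependency(rows, list(subset) + [attr], decision)
            - dependency(rows, list(subset), decision))

def core(rows, cond_attrs, decision):
    """Attributes whose removal lowers the dependency of the full set."""
    full = dependency(rows, cond_attrs, decision)
    return [a for a in cond_attrs
            if dependency(rows, [b for b in cond_attrs if b != a], decision) < full]

# Hypothetical toy decision table: d is "yes" when a or b is 1, and c copies a.
rows = [
    {"a": 0, "b": 0, "c": 0, "d": "no"},
    {"a": 0, "b": 1, "c": 0, "d": "yes"},
    {"a": 1, "b": 0, "c": 1, "d": "yes"},
    {"a": 1, "b": 1, "c": 1, "d": "yes"},
]

print(dependency(rows, ["a"], "d"))         # 0.5
print(dependency(rows, ["a", "b"], "d"))    # 1.0
print(significance(rows, ["b"], "a", "d"))  # 0.5
print(core(rows, ["a", "b", "c"], "d"))     # ['b']
```

In this toy table, `b` is the only core attribute because `a` and `c` are interchangeable. A search that starts from the core, as the abstract describes, would begin each ant at `{b}` and only needs to add one more attribute to reach full dependency.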