Classical rough set theory has difficulty handling numerical (real-valued) data. To address this, a feature reduction method for decision tables based on neighborhood entropy is proposed. A neighborhood relation is introduced to granulate the information: each object in the universe is assigned a neighborhood subset, called a neighborhood granule. Neighborhood entropy is then defined to measure the uncertainty of numerical data, and its monotonicity is proved. A feature significance measure that weights neighborhood entropy together with classification accuracy is proposed, and two heuristic feature reduction algorithms are designed on the basis of the monotonicity of neighborhood entropy. Theoretical analysis and a worked example show that the method is effective and feasible.
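To make the notions of neighborhood granule and neighborhood entropy concrete, the following is a minimal sketch, not the paper's own definition. The abstract does not state the formula, so the sketch assumes a form commonly used in the neighborhood rough set literature: the granule of an object is the set of objects within Euclidean distance `delta` of it on the chosen features, and the entropy is the negative mean of `log2(|granule| / n)`. The function names, the distance metric, the threshold `delta`, and the sample data are all illustrative assumptions.

```python
import math


def neighborhood(data, i, features, delta):
    """Return the indices of objects within distance delta of object i.

    Assumption: Euclidean distance restricted to the given feature indices.
    """
    xi = data[i]
    return [
        j
        for j, xj in enumerate(data)
        if math.sqrt(sum((xi[f] - xj[f]) ** 2 for f in features)) <= delta
    ]


def neighborhood_entropy(data, features, delta):
    """Assumed form: -(1/n) * sum over objects of log2(|granule| / n)."""
    n = len(data)
    total = sum(
        math.log2(len(neighborhood(data, i, features, delta)) / n)
        for i in range(n)
    )
    return -total / n


# Hypothetical sample: four objects described by two numerical features.
sample = [(0.10, 0.90), (0.15, 0.20), (0.80, 0.85), (0.82, 0.10)]

print(neighborhood_entropy(sample, [0], 0.2))     # 1.0
print(neighborhood_entropy(sample, [0, 1], 0.2))  # 2.0
```

Under these assumptions, adding a feature can only increase the distance between two objects, so each granule can only shrink and the entropy can only stay the same or grow. In the sample, the entropy rises from 1.0 to 2.0 when the second feature is added, which illustrates the kind of monotonicity a heuristic reduction algorithm can exploit.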