Attribute reduction is an active research topic in machine learning and knowledge discovery, and measuring attribute importance is the key step in constructing an attribute reduction algorithm. For incomplete mixed information systems, a new integrated measure of attribute importance, the neighborhood combination measure, is defined under the neighborhood relation, and a neighborhood combination measure based attribute reduction (NCMAR) algorithm is proposed on that basis. Experiments on several UCI data sets show that the NCMAR algorithm not only handles mixed information systems in which symbolic and numerical attributes coexist directly, but is also applicable to incomplete information systems; it obtains smaller reducts while maintaining high classification accuracy.
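To illustrate what a neighborhood relation on an incomplete mixed information system can look like, the following is a minimal sketch under stated assumptions. The abstract does not give the relation or the neighborhood combination measure, so this is not the NCMAR definition: it assumes a common tolerance-style rule in which a missing value (written `None`) matches anything, symbolic values must be equal, and numerical values must differ by at most a threshold `delta`. The names `is_neighbor`, `neighborhoods`, `numeric`, `delta`, and the sample data are hypothetical.

```python
from typing import List, Optional, Sequence, Set, Union

# None marks a missing value (an assumption made for this sketch).
Value = Optional[Union[str, float]]


def is_neighbor(x: Sequence[Value], y: Sequence[Value],
                numeric: Set[int], delta: float = 0.1) -> bool:
    """Return True if x and y are indistinguishable on every attribute.

    Assumed rule (not taken from the paper):
      - a missing value on either side matches anything;
      - numerical attributes (indices in `numeric`) match if |x_a - y_a| <= delta;
      - symbolic attributes match only if the values are equal.
    """
    for a, (xa, ya) in enumerate(zip(x, y)):
        if xa is None or ya is None:
            continue  # missing values are tolerated
        if a in numeric:
            if abs(xa - ya) > delta:
                return False
        elif xa != ya:
            return False
    return True


def neighborhoods(data: Sequence[Sequence[Value]],
                  numeric: Set[int], delta: float = 0.1) -> List[Set[int]]:
    """For each object, collect the indices of all objects in its neighborhood."""
    return [
        {j for j, y in enumerate(data) if is_neighbor(x, y, numeric, delta)}
        for x in data
    ]


# Hypothetical sample: attribute 0 is symbolic, attribute 1 is numerical.
example = [
    ("a", 0.10),
    ("a", 0.15),
    ("b", 0.12),
    (None, 0.90),
    ("a", None),
]
example_neighborhoods = neighborhoods(example, numeric={1}, delta=0.1)
```

Under this rule the relation is reflexive and symmetric but not necessarily transitive, so the neighborhoods may overlap rather than partition the objects. A reduction algorithm would evaluate an importance measure over such neighborhoods for candidate attribute subsets.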