To address the problem of database refinement in data mining, we analyze the characteristics and shortcomings of existing attribute reduction methods and, drawing on the simple operation and fast classification of decision tree algorithms, establish a decision-tree-based attribute reduction method (abbreviated BD-RED) built on a rule-based description of knowledge and a similarity comparison between rule families. We discuss how to construct an interpretable similarity measure between rule families, give a concrete implementation strategy for BD-RED, and analyze its performance on examples. The results show that BD-RED has good structural properties and strong operability, can effectively achieve attribute reduction under different decision-making philosophies, and is suitable for attribute reduction on large-scale databases of various types.
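The general idea of comparing rule families to decide whether an attribute is dispensable can be illustrated with a minimal sketch. The abstract does not specify BD-RED's similarity measure or implementation strategy, so everything below is an assumption made for illustration: a small ID3-style tree on categorical data, a similarity defined as the fraction of records on which two rule families give the same decision, a greedy backward-elimination loop, and a toy dataset. All function names and the `threshold` parameter are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def build_tree(rows, labels, attrs):
    """ID3-style tree: a leaf is a class label, a node is (attr, branches, majority)."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attrs:
        return majority
    def remainder(a):  # weighted entropy after splitting on attribute a
        groups = {}
        for r, y in zip(rows, labels):
            groups.setdefault(r[a], []).append(y)
        return sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    best = min(attrs, key=remainder)
    rest = [a for a in attrs if a != best]
    branches = {}
    for v in sorted(set(r[best] for r in rows)):
        sub = [(r, y) for r, y in zip(rows, labels) if r[best] == v]
        branches[v] = build_tree([r for r, _ in sub], [y for _, y in sub], rest)
    return (best, branches, majority)

def rules(tree, path=()):
    """Flatten a tree into a rule family: a list of (conditions, decision) pairs."""
    if not isinstance(tree, tuple):
        return [(path, tree)]
    attr, branches, _ = tree
    out = []
    for v, sub in branches.items():
        out += rules(sub, path + ((attr, v),))
    return out

def classify(tree, row):
    while isinstance(tree, tuple):
        attr, branches, majority = tree
        tree = branches.get(row[attr], majority)
    return tree

def similarity(tree_a, tree_b, rows):
    """Assumed measure: share of records on which both rule families agree."""
    return sum(classify(tree_a, r) == classify(tree_b, r) for r in rows) / len(rows)

def reduce_attributes(rows, labels, attrs, threshold=1.0):
    """Greedily drop attributes whose removal keeps similarity >= threshold."""
    full = build_tree(rows, labels, attrs)
    kept = list(attrs)
    for a in attrs:
        trial = [x for x in kept if x != a]
        if trial and similarity(full, build_tree(rows, labels, trial), rows) >= threshold:
            kept = trial
    return kept

# Toy data (invented): the decision depends only on "windy".
rows = [
    {"outlook": "sunny", "windy": "no", "temp": "hot"},
    {"outlook": "sunny", "windy": "yes", "temp": "hot"},
    {"outlook": "rainy", "windy": "no", "temp": "cool"},
    {"outlook": "rainy", "windy": "yes", "temp": "cool"},
]
labels = ["play", "stay", "play", "stay"]
attrs = ["outlook", "windy", "temp"]

full_tree = build_tree(rows, labels, attrs)
print(rules(full_tree))
print(reduce_attributes(rows, labels, attrs))
```

In this sketch, lowering `threshold` below 1.0 tolerates some disagreement between the reduced and the full rule family, which is one simple way to model a more or less conservative decision preference; the paper's own mechanism for this may differ.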