Discretization is an important data preprocessing step that is widely used in rule extraction, knowledge discovery, and classification. This paper proposes a discretization algorithm for continuous attributes that combines a binary ant colony with rough sets. The algorithm constructs a binary ant colony network over the space of candidate cut point sets generated from multidimensional continuous attributes, and uses the approximation classification accuracy of rough sets as the fitness function of the ant colony algorithm to search for the globally optimal set of discretization cut points. The algorithm is validated on seven UCI data sets, and the experimental results indicate that it achieves relatively good discretization performance.
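The idea summarized above can be illustrated with a minimal sketch. The abstract does not give the network construction, the pheromone update rule, or the exact fitness formula, so everything below is an assumption made for illustration: candidate cut points are taken as midpoints between adjacent distinct attribute values, each candidate is a binary choice (keep or drop) with its own pair of pheromone values, and the fitness is the rough set approximation quality (the share of objects in the positive region) minus a small hypothetical penalty on the number of selected cut points. The names `candidate_cuts`, `approximation_quality`, `fitness`, and `binary_ant_colony`, and the parameters `penalty`, `rho`, `n_ants`, and `n_iter`, are not taken from the paper.

```python
import random
from collections import defaultdict


def candidate_cuts(rows):
    """Assumed candidate set: midpoints between adjacent distinct values per attribute."""
    cuts = []
    for j in range(len(rows[0])):
        values = sorted({row[j] for row in rows})
        cuts.extend((j, (a + b) / 2) for a, b in zip(values, values[1:]))
    return cuts


def approximation_quality(rows, labels, cuts, mask):
    """Share of objects whose equivalence class (under the selected cuts) has one label."""
    blocks = defaultdict(list)
    for row, label in zip(rows, labels):
        # Two objects are indiscernible if they fall on the same side of every selected cut.
        code = tuple(bool(bit) and row[j] > c for (j, c), bit in zip(cuts, mask))
        blocks[code].append(label)
    consistent = sum(len(ys) for ys in blocks.values() if len(set(ys)) == 1)
    return consistent / len(rows)


def fitness(rows, labels, cuts, mask, penalty=0.01):
    """Hypothetical fitness: approximation quality minus a small penalty on cut count."""
    quality = approximation_quality(rows, labels, cuts, mask)
    return quality - penalty * sum(mask) / max(len(cuts), 1)


def binary_ant_colony(rows, labels, n_ants=20, n_iter=50, rho=0.1, seed=0):
    """Each ant picks 0 or 1 for every candidate cut, guided by pheromone."""
    rng = random.Random(seed)
    cuts = candidate_cuts(rows)
    tau = [[1.0, 1.0] for _ in cuts]  # pheromone on the "drop" and "keep" edges
    best_mask = [1] * len(cuts)       # start from the full candidate set
    best_fit = fitness(rows, labels, cuts, best_mask)
    for _ in range(n_iter):
        for _ in range(n_ants):
            mask = [1 if rng.random() < t1 / (t0 + t1) else 0 for t0, t1 in tau]
            fit = fitness(rows, labels, cuts, mask)
            if fit > best_fit:
                best_mask, best_fit = mask, fit
        # Evaporate (with a floor), then reinforce the best-so-far path.
        for i, bit in enumerate(best_mask):
            tau[i] = [max(0.05, (1 - rho) * t) for t in tau[i]]
            tau[i][bit] += rho * max(best_fit, 0.0)
    selected = [cut for cut, bit in zip(cuts, best_mask) if bit]
    return selected, best_fit


if __name__ == "__main__":
    rows = [(1.0, 5.0), (2.0, 5.5), (8.0, 5.2), (9.0, 5.1)]
    labels = [0, 0, 1, 1]
    print(binary_ant_colony(rows, labels))
```

Each returned cut is an `(attribute_index, threshold)` pair. The penalty term is only there so that, among cut sets that preserve the same approximation quality, the smaller one scores higher; a different trade-off may well be used in the actual algorithm.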