Data discretization contributes substantially to the induction of classification rules or trees by machine learning methods. Rough set theory is an effective tool for discretizing continuous information systems. Herein, a new method is proposed to improve on typical rough-set-based heuristic algorithms for data discretization in two ways: it uses decision information to reduce the number of candidate cuts, and it measures cut significance more reasonably through a new concept, the cut selection probability. Simulations demonstrate that, compared with other typical discretization algorithms based on rough set theory, the proposed method discretizes continuous information systems more capably and effectively. It can effectively improve the predictive accuracy of information systems while still conceptually preserving their consistency.
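
To illustrate the idea of using decision information to shrink the set of candidate cuts, the following is a minimal sketch and not the method proposed here, whose details are not given in this abstract. It assumes a single continuous attribute and applies a common boundary-point filter: a midpoint between two adjacent attribute values is dropped when all objects on both sides carry the same single decision class, since such a cut cannot separate objects with different decisions. The function names `all_cuts` and `candidate_cuts` are hypothetical, and the cut selection probability is not modeled.

```python
from collections import defaultdict


def all_cuts(values):
    """Baseline: every midpoint between adjacent distinct attribute values."""
    distinct = sorted(set(values))
    return [(lo + hi) / 2 for lo, hi in zip(distinct, distinct[1:])]


def candidate_cuts(values, decisions):
    """Assumed boundary-point filter (illustrative, not the paper's rule).

    Keeps a midpoint only if the two adjacent attribute values do not
    share one identical, single decision class.
    """
    # Collect the set of decision classes observed at each attribute value.
    classes_at = defaultdict(set)
    for v, d in zip(values, decisions):
        classes_at[v].add(d)

    distinct = sorted(classes_at)
    cuts = []
    for lo, hi in zip(distinct, distinct[1:]):
        left, right = classes_at[lo], classes_at[hi]
        # A cut between two values with the same pure class separates nothing.
        if len(left) == 1 and left == right:
            continue
        cuts.append((lo + hi) / 2)
    return cuts


if __name__ == "__main__":
    values = [1.0, 2.0, 3.0, 4.0, 5.0]
    decisions = [0, 0, 1, 1, 0]
    print(all_cuts(values))                   # unfiltered candidates
    print(candidate_cuts(values, decisions))  # candidates after filtering
```

On this toy attribute the filter keeps two of the four midpoints, which is the kind of reduction in candidate cuts that a heuristic cut-selection step would then work on.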