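This paper studies the relationship among the number of cut points, granular entropy, and classification accuracy in discretization schemes, and proves that granular entropy decreases as the number of cut points increases. A hybrid numerical discretization algorithm is designed to provide multiple consistent discretized decision tables. Experiments show that the correlation between granular entropy and classification accuracy is sometimes higher than that between the number of cut points and classification accuracy.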
This paper studies the relationship among the number of cut points, granular entropy, and classification accuracy in discretization. It is proven that granular entropy decreases as the number of cut points increases. A hybrid discretization algorithm is proposed to generate the discretization schemes used to study these measures. Simulation experiments show that the absolute value of the correlation coefficient between the number of cut points and classification accuracy is quite large, as it is for granular entropy and classification accuracy. Sometimes, the correlation between granular entropy and classification accuracy is smaller than that between the number of cut points and classification accuracy.
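The monotonicity claim can be illustrated with a minimal sketch. The abstract does not give the definition of granular entropy, so the code assumes a rough-entropy-style form, E = Σ (|X_i| / |U|) · log2 |X_i|, taken over the blocks X_i that the cut points induce on the universe U. The helper names, the sample values, and the cut sets are all hypothetical, and only a single attribute with nested cut sets is considered.

```python
import bisect
import math
from collections import Counter


def discretize(values, cuts):
    """Map each value to the index of the interval it falls in."""
    cuts = sorted(cuts)
    return [bisect.bisect_right(cuts, v) for v in values]


def granular_entropy(labels):
    """Assumed form (not necessarily the paper's): sum of (|X|/|U|) * log2(|X|)."""
    n = len(labels)
    return sum(count / n * math.log2(count) for count in Counter(labels).values())


# Hypothetical attribute values and nested cut sets (each one refines the previous).
values = [0.1, 0.4, 0.5, 0.9, 1.3, 1.7, 2.2, 2.8]
cut_sets = [[], [1.0], [1.0, 2.0], [0.45, 1.0, 2.0]]

entropies = [granular_entropy(discretize(values, cuts)) for cuts in cut_sets]
for cuts, e in zip(cut_sets, entropies):
    print(len(cuts), "cut points ->", e)
```

Under this assumed definition, splitting a block of size a + b into blocks of sizes a and b can never increase the sum, because a·log2(a) + b·log2(b) ≤ (a + b)·log2(a + b). This is why the values fall as cut points are added to an existing cut set.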