Building on the maximum relevance and minimum redundancy (mRMR) attribute selection method, this paper proposes a selective Bayesian classifier based on change of class relevance influence (CCRI SBC). A regulator factor is introduced to adjust how strongly class relevance influences attribute selection, which addresses the tendency of mRMR to admit redundant attributes. To avoid the arbitrariness in classification results that arises when the number of attributes is specified manually, the Bayesian information criterion is used to determine the optimal number of attributes automatically. To make CCRI SBC applicable to datasets containing continuous variables, an equal-frequency class-attribute interdependence maximization discretization method is proposed, which offers high classification accuracy and short discretization time. Experimental results on UCI datasets show that the proposed method can effectively handle classification of discrete and continuous high-dimensional data.
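The abstract does not state the exact selection criterion, so the following is only a minimal sketch of how a regulator factor might enter a greedy mRMR-style search. It assumes a score of the form alpha * relevance - mean redundancy, with both terms estimated as empirical mutual information on discrete data. The names `mutual_information`, `select_attributes`, and `alpha` are hypothetical and are not taken from the paper, and the Bayesian information criterion step for choosing the number of attributes is not shown.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def select_attributes(columns, labels, k, alpha=1.0):
    """Greedy mRMR-style selection of k attributes.

    columns: dict mapping attribute name -> list of discrete values
    labels:  list of class labels, same length as each column
    alpha:   hypothetical regulator factor scaling the class-relevance term
             (assumed form; the paper's exact criterion may differ)
    """
    relevance = {a: mutual_information(v, labels) for a, v in columns.items()}
    selected, remaining = [], list(columns)

    def score(a):
        if not selected:
            return alpha * relevance[a]
        # Mean mutual information with the attributes already chosen.
        redundancy = sum(
            mutual_information(columns[a], columns[s]) for s in selected
        ) / len(selected)
        return alpha * relevance[a] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    labels = [0, 0, 0, 0, 1, 1, 1, 1]
    columns = {
        "a": [0, 0, 0, 0, 1, 1, 1, 0],  # strongly class-relevant
        "b": [0, 0, 0, 0, 1, 1, 1, 0],  # exact duplicate of "a"
        "c": [0, 1, 0, 0, 1, 1, 0, 1],  # weaker, but nearly independent of "a"
    }
    print(select_attributes(columns, labels, k=2, alpha=1.0))
    print(select_attributes(columns, labels, k=2, alpha=5.0))
```

In this toy setup, a large `alpha` lets class relevance dominate, so the duplicated attribute is selected alongside the original. A smaller `alpha` gives the redundancy penalty more weight, so the less redundant attribute is preferred.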