As one of the criteria for selecting and evaluating data mining algorithms, the degree of classification inconsistency in a data set has long been an important topic in research on classification rules. However, as data mining on incomplete data sets has advanced, evaluation methods that are based on information entropy and built on equivalence relations can no longer meet practical needs, since the equivalence relation they require may not hold when values are missing. Building on the similarity relation and combining it with evidence theory, this paper presents a method for constructing information granules from belief and plausibility measures. It also constructs measures of system classification inconsistency analogous to the inconsistency degree and confusion degree of fuzzy entropy, and analyzes and proves their relevant properties. The proofs and worked examples show that the proposed measures describe the classification inconsistency of a system with missing data well, and that when the data set has no missing values they give the same results as previous studies.
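The idea of deriving belief and plausibility from a similarity relation on incomplete data can be illustrated with a minimal sketch. It is not the paper's construction: it assumes a symmetric relation in which a missing value (written `*` here) matches any value, and it uses the standard rough-set reading in which the belief of a decision class is the fraction of objects whose similarity class lies entirely inside that class, and the plausibility is the fraction whose similarity class overlaps it. The names `MISSING`, `similar`, `similarity_classes`, and `belief_plausibility` are hypothetical.

```python
from fractions import Fraction

MISSING = "*"  # assumed marker for a missing attribute value


def similar(x, y):
    """Two objects are similar if every attribute agrees or is missing in either."""
    return all(a == b or a == MISSING or b == MISSING for a, b in zip(x, y))


def similarity_classes(rows):
    """For each object, the set of indices of all objects similar to it."""
    return [{j for j, y in enumerate(rows) if similar(x, y)} for x in rows]


def belief_plausibility(rows, labels):
    """Map each decision value to its (belief, plausibility) pair."""
    classes = similarity_classes(rows)
    n = len(rows)
    result = {}
    for d in sorted(set(labels)):
        target = {i for i, lab in enumerate(labels) if lab == d}
        lower = sum(1 for s in classes if s <= target)  # class certainly inside target
        upper = sum(1 for s in classes if s & target)   # class possibly inside target
        result[d] = (Fraction(lower, n), Fraction(upper, n))
    return result


# Toy incomplete decision table: two condition attributes, one decision.
rows = [("a", "x"), ("a", "*"), ("b", "y"), ("b", "y")]
labels = ["yes", "no", "yes", "yes"]
measures = belief_plausibility(rows, labels)
```

In this toy table the first two objects are similar but carry different decisions, so the belief of each class falls strictly below its plausibility; the gap between the two is one simple indicator of classification inconsistency. When no value is missing, the relation reduces to ordinary equality of attribute vectors, which is an equivalence relation, in line with the abstract's remark that the complete-data case agrees with earlier results.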