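To address the statistical skew found in existing multi-relational naive Bayesian classifiers, this paper extends the definition of the semantic relationship graph, gives a new statistical counting method, and constructs the corresponding multi-relational naive Bayesian classification formula, yielding a new multi-relational naive Bayesian classifier built on relational database technology. To join relational tables efficiently, the tuple ID propagation method is used to join them virtually. To further improve classification accuracy, attributes are pruned according to a mutual information criterion. Experiments show that the new classifier has good classification performance.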
To avoid the statistical bias in existing multi-relational naive Bayesian classifiers, a new multi-relational naive Bayesian classifier, nMRNBC, is proposed. First, the definition of the semantic relationship graph is extended. Then, a new counting method for relational individuals is presented. Finally, the corresponding naive Bayesian classification formula is constructed. For efficiency, the tuple ID propagation method is adopted so that relational tables are joined virtually rather than physically. To improve accuracy, attributes are filtered by a criterion based on mutual information. Experiments show that the new classifier achieves good classification performance.
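The abstract names tuple ID propagation and mutual-information attribute filtering but gives no details of either. The following is a minimal Python sketch of the general idea only, not the nMRNBC implementation. The function names (`propagate_ids`, `mutual_information`, `attribute_score`), the toy tables, the counting rule (one pair per propagated target ID), and the `THRESHOLD` value are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import log2

def propagate_ids(target, other, key):
    """Tag each row of `other` with the IDs of target tuples sharing its `key` value.

    This stands in for a physical join: no joined table is materialised.
    """
    ids_by_key = defaultdict(set)
    for tid, row in target.items():
        ids_by_key[row[key]].add(tid)
    return [(row, ids_by_key.get(row[key], set())) for row in other]

def mutual_information(pairs):
    """I(A; C) in bits for a list of (attribute value, class label) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    attr = Counter(a for a, _ in pairs)
    cls = Counter(c for _, c in pairs)
    return sum(k / n * log2(k * n / (attr[a] * cls[c]))
               for (a, c), k in joint.items())

# Hypothetical toy data: a target relation keyed by tuple ID, and one
# non-target relation that joins to it on "acct".
target = {1: {"acct": "a", "label": "+"},
          2: {"acct": "a", "label": "+"},
          3: {"acct": "b", "label": "-"}}
other = [{"acct": "a", "type": "gold", "region": "north"},
         {"acct": "b", "type": "basic", "region": "north"},
         {"acct": "c", "type": "gold", "region": "south"}]

tagged = propagate_ids(target, other, "acct")

def attribute_score(attribute):
    # Assumed counting rule: one (value, label) pair per propagated target ID.
    pairs = [(row[attribute], target[tid]["label"])
             for row, ids in tagged for tid in sorted(ids)]
    return mutual_information(pairs)

THRESHOLD = 0.1  # assumed cut-off, not taken from the paper
kept = [a for a in ("type", "region") if attribute_score(a) >= THRESHOLD]
```

In this toy data, `type` separates the two classes while `region` takes the same value for every joinable row, so only `type` survives the filter. The row with `acct` equal to `"c"` receives no target IDs and therefore contributes nothing to the counts.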