Balancing algorithmic efficiency against effectiveness, this paper proposes a new learning algorithm for the bounded semi-naive Bayesian classifier (BSNBC). The traditional semi-naive Bayesian classifier (SNBC) can combine only two basic attributes into one combined attribute, which severely limits its classification performance. BSNBC overcomes this weakness to a certain extent: it can combine up to K basic attributes into a single combined attribute node. Integer programming (IP) and linear programming (LP) methods can be used to learn a BSNBC, but their search processes are partly blind. The proposed algorithm instead uses conditional mutual information to group strongly related attributes into combined attributes. Experimental results demonstrate its effectiveness.
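To illustrate the idea of grouping attributes by conditional mutual information, the following is a minimal Python sketch and not the paper's actual procedure. It estimates I(X_i; X_j | C) from discrete data by frequency counts, then greedily merges the most dependent attributes into groups of at most K. The function names, the greedy merge order, and the `threshold` cut-off are assumptions introduced here for illustration; the abstract does not specify them.

```python
from collections import Counter
from itertools import combinations
from math import log


def conditional_mutual_information(xi, xj, c):
    """Empirical I(Xi; Xj | C) in nats for three equal-length discrete sequences."""
    n = len(c)
    n_ijc = Counter(zip(xi, xj, c))
    n_ic = Counter(zip(xi, c))
    n_jc = Counter(zip(xj, c))
    n_c = Counter(c)
    cmi = 0.0
    for (a, b, k), count in n_ijc.items():
        # p(a, b, k) * log( p(a, b | k) / (p(a | k) * p(b | k)) )
        cmi += (count / n) * log(count * n_c[k] / (n_ic[(a, k)] * n_jc[(b, k)]))
    return cmi


def group_attributes(columns, labels, K, threshold=0.0):
    """Hypothetical greedy grouping: merge the most dependent attributes first,
    never letting a group exceed K attributes. `threshold` is an assumed
    cut-off below which attributes are treated as conditionally independent."""
    m = len(columns)
    scored_pairs = sorted(
        (
            (conditional_mutual_information(columns[i], columns[j], labels), i, j)
            for i, j in combinations(range(m), 2)
        ),
        reverse=True,
    )
    group_of = {i: {i} for i in range(m)}
    for score, i, j in scored_pairs:
        if score <= threshold:
            break  # remaining pairs are even less dependent
        gi, gj = group_of[i], group_of[j]
        if gi is not gj and len(gi) + len(gj) <= K:
            merged = gi | gj
            for attr in merged:
                group_of[attr] = merged
    unique_groups = {frozenset(g) for g in group_of.values()}
    return sorted(sorted(g) for g in unique_groups)


# Toy data: attribute 1 copies attribute 0; attribute 2 is independent of both
# within each class.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
x0 = [0, 0, 1, 1, 0, 0, 1, 1]
x1 = list(x0)
x2 = [0, 1, 0, 1, 0, 1, 0, 1]
groups = group_attributes([x0, x1, x2], labels, K=2, threshold=0.1)
print(groups)
```

On this toy data the two identical attributes end up in one combined attribute while the independent one stays alone. Each resulting group would then be treated as a single combined attribute node when estimating the class-conditional probabilities of the classifier.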