The naive Bayes classification algorithm has long been an active topic in classification research because of its simplicity and efficiency. However, its conditional independence assumption fails to capture the dependencies among variables that exist in most real-world applications, which degrades its classification performance. To address this problem, an improved algorithm is proposed in which the weighting coefficients are determined from the covariance and the chi-square goodness-of-fit statistic. Experimental results show that the proposed algorithm achieves somewhat higher classification accuracy than the naive Bayes algorithm.
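The abstract does not state the weighting formula, so the following is only an illustrative sketch of the general idea of attribute-weighted naive Bayes, not the proposed algorithm itself. It assumes categorical attributes, Laplace smoothing, and a weight for each attribute equal to its Pearson chi-square statistic against the class label, normalised so that the weights average to 1. The covariance term mentioned above is omitted, and all function names (`chi_square`, `fit_weighted_nb`, `predict`) are hypothetical.

```python
import math
from collections import Counter, defaultdict


def chi_square(xs, ys):
    """Pearson chi-square statistic of the contingency table of xs against ys."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    fx, fy = Counter(xs), Counter(ys)
    stat = 0.0
    for a in fx:
        for b in fy:
            expected = fx[a] * fy[b] / n
            observed = joint.get((a, b), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


def fit_weighted_nb(rows, labels):
    """Collect counts and attribute weights for a weighted naive Bayes model."""
    n_features = len(rows[0])
    columns = [[r[j] for r in rows] for j in range(n_features)]
    chi = [chi_square(col, labels) for col in columns]
    mean_chi = sum(chi) / n_features
    # Assumed weighting scheme: chi-square scaled to average 1.
    # If every attribute is independent of the class, fall back to plain naive Bayes.
    weights = [c / mean_chi if mean_chi > 0 else 1.0 for c in chi]
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    for r, y in zip(rows, labels):
        for j, v in enumerate(r):
            value_counts[(j, y)][v] += 1
    return {
        "weights": weights,
        "class_counts": Counter(labels),
        "value_counts": value_counts,
        "n_values": [len(set(col)) for col in columns],
        "n": len(rows),
    }


def predict(model, row):
    """Return argmax_c of log P(c) + sum_j w_j * log P(x_j | c)."""
    best_class, best_score = None, -math.inf
    for c, nc in model["class_counts"].items():
        score = math.log(nc / model["n"])
        for j, v in enumerate(row):
            # Laplace-smoothed conditional probability estimate.
            p = (model["value_counts"][(j, c)][v] + 1) / (nc + model["n_values"][j])
            score += model["weights"][j] * math.log(p)
        if score > best_score:
            best_class, best_score = c, score
    return best_class


# Toy data: attribute 0 determines the class, attribute 1 is independent of it.
rows = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]
labels = ["p", "p", "q", "q"]
model = fit_weighted_nb(rows, labels)
print(model["weights"])
print(predict(model, ("a", "y")))
```

With every weight set to 1, the decision rule reduces to ordinary naive Bayes. In this sketch an attribute that is statistically independent of the class receives weight 0 and no longer influences the prediction.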