This paper describes the naive Bayes spam filtering algorithm. To compute the conditional probabilities in naive Bayes, it adopts the multivariate Bernoulli event model and proposes an improvement to that model. Experiments on the Ling-Spam corpus show that the improved algorithm effectively raises the filter's recall and precision and lowers its error rate.
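For orientation, the baseline that the abstract builds on can be sketched as follows. This is a minimal illustration of naive Bayes under the standard multivariate Bernoulli event model, not the paper's improved variant, whose details the abstract does not give. The function names, the toy messages, and the Laplace smoothing constant `alpha` are assumptions made for the example. In this model each vocabulary term contributes to the class score whether it is present or absent, and the conditional probability P(t|c) is estimated from the fraction of class-c training messages that contain t. The last function computes the three measures the abstract reports, using their usual definitions with spam as the positive class.

```python
import math
from collections import defaultdict


def train_bernoulli_nb(docs, labels, alpha=1.0):
    """Fit a multivariate Bernoulli naive Bayes model.

    docs: list of token lists; labels: parallel list of class names.
    alpha: Laplace smoothing constant (an assumption of this sketch).
    """
    vocab = sorted({t for tokens in docs for t in tokens})
    classes = sorted(set(labels))
    n_docs = {c: 0 for c in classes}
    doc_freq = {c: defaultdict(int) for c in classes}
    for tokens, c in zip(docs, labels):
        n_docs[c] += 1
        for t in set(tokens):  # presence only; repeats are ignored
            doc_freq[c][t] += 1
    prior = {c: n_docs[c] / len(docs) for c in classes}
    # P(t | c) = (messages of class c containing t + alpha) / (N_c + 2 * alpha)
    cond = {
        c: {t: (doc_freq[c][t] + alpha) / (n_docs[c] + 2 * alpha) for t in vocab}
        for c in classes
    }
    return vocab, prior, cond


def classify(tokens, model):
    """Return the class with the highest log posterior score."""
    vocab, prior, cond = model
    present = set(tokens)
    scores = {}
    for c in prior:
        score = math.log(prior[c])
        for t in vocab:  # absent terms contribute log(1 - P(t | c))
            p = cond[c][t]
            score += math.log(p) if t in present else math.log(1.0 - p)
        scores[c] = score
    return max(scores, key=scores.get)


def spam_metrics(true_labels, predicted, positive="spam"):
    """Return (recall, precision, error_rate) with `positive` as the target class."""
    pairs = list(zip(true_labels, predicted))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    error_rate = (fp + fn) / len(pairs) if pairs else 0.0
    return recall, precision, error_rate


# Toy training set (hypothetical messages, not taken from Ling-Spam).
docs = [
    ["cheap", "pills", "buy"],
    ["buy", "now", "cheap"],
    ["meeting", "agenda", "notes"],
    ["linguistics", "conference", "agenda"],
]
labels = ["spam", "spam", "legit", "legit"]
model = train_bernoulli_nb(docs, labels)
```

Calling `classify(["cheap", "buy", "offer"], model)` scores a new message against both classes; terms outside the training vocabulary, such as `offer` here, are simply ignored. On a real corpus the predictions for a held-out set would be passed to `spam_metrics` to obtain recall, precision, and error rate.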