Inspired by the self-protection mechanism of the biological immune system, this paper proposes a personalized spam filtering algorithm based on immune principles. First, spam communities are defined according to the user's interests and the features of the emails, and each spam message is assigned to one of these communities. Next, the features of each community are extracted and represented by a set of feature detectors. Finally, an incoming email is filtered by checking whether it belongs to any spam community. The algorithm learns incrementally, so it can filter spam continuously without retraining. Its immune learning and immune memory mechanisms improve the detection rate and the accuracy and also speed up filtering. Experiments and analysis show that the algorithm filters spam better than the AISEC and Naive Bayesian algorithms.
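The community-based filtering idea summarized above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not specify the detector representation, the affinity measure, or any thresholds, so the token-set detectors, the overlap-based `affinity`, the `max_detectors` cap, the `threshold` value, and the names `SpamCommunity` and `CommunityFilter` are all assumptions made for illustration. The immune learning and immune memory mechanisms are not modeled.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass, field


def tokenize(text: str) -> set[str]:
    """Lowercased whitespace tokens; a stand-in for real feature extraction."""
    return set(text.lower().split())


@dataclass
class SpamCommunity:
    """One spam community, summarized by a set of feature detectors.

    Assumption: a detector is simply a frequent token of the community.
    """
    name: str
    token_counts: Counter = field(default_factory=Counter)
    max_detectors: int = 20  # hypothetical cap on detectors per community

    def learn(self, text: str) -> None:
        """Incremental update: fold one more spam message into the community."""
        self.token_counts.update(tokenize(text))

    @property
    def detectors(self) -> set[str]:
        """The most frequent tokens serve as the community's detectors."""
        return {tok for tok, _ in self.token_counts.most_common(self.max_detectors)}

    def affinity(self, text: str) -> float:
        """Fraction of the message's tokens matched by a detector (assumed measure)."""
        tokens = tokenize(text)
        if not tokens:
            return 0.0
        return len(tokens & self.detectors) / len(tokens)


class CommunityFilter:
    """Flags a message as spam if it belongs to any known spam community."""

    def __init__(self, threshold: float = 0.5) -> None:
        self.threshold = threshold  # hypothetical membership threshold
        self.communities: dict[str, SpamCommunity] = {}

    def learn_spam(self, community: str, text: str) -> None:
        """Add a spam message to a community, creating the community if needed."""
        self.communities.setdefault(community, SpamCommunity(community)).learn(text)

    def classify(self, text: str) -> str | None:
        """Return the best-matching community name, or None if not spam."""
        best = max(self.communities.values(),
                   key=lambda c: c.affinity(text), default=None)
        if best is not None and best.affinity(text) >= self.threshold:
            return best.name
        return None


spam_filter = CommunityFilter(threshold=0.5)
spam_filter.learn_spam("pharma", "cheap pills online pharmacy discount")
spam_filter.learn_spam("pharma", "discount pills buy now")
spam_filter.learn_spam("finance", "win money lottery prize claim")

print(spam_filter.classify("cheap discount pills"))       # matches "pharma"
print(spam_filter.classify("meeting agenda for monday"))  # no community matches
```

Because `learn_spam` only updates token counts, new spam can be folded in at any time without rebuilding the existing communities, which mirrors the incremental-learning property claimed in the abstract in a very simplified form.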