An incremental spam filtering method based on rough set theory is proposed. The method uses the local minimal certainty of the mail samples as a threshold to control rule generation. It also adds a feedback stage to the mail recognition and filtering process: misjudged and unrecognized samples are relearned as incremental samples, and the confidence of the mail rules is adjusted dynamically. Rules with higher confidence are selected for updating according to the threshold, which reduces the number of rules and improves the correct recognition rate of samples. Finally, experiments verify the effectiveness of the method.
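
The feedback loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not define local minimal certainty or the rough-set reduction, so the sketch assumes a fixed `threshold` and takes a rule's confidence to be the plain ratio `hits / matches`. The names `Rule`, `IncrementalFilter`, `classify`, and `feedback` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    condition: frozenset  # (attribute, value) pairs that must all hold
    label: str            # decision, e.g. "spam" or "ham"
    hits: int = 0         # matching samples whose true label equals `label`
    matches: int = 0      # all samples that matched `condition`

    @property
    def confidence(self) -> float:
        # Assumption: confidence is the simple ratio hits / matches.
        return self.hits / self.matches if self.matches else 0.0


class IncrementalFilter:
    def __init__(self, threshold: float):
        # Assumption: a fixed threshold stands in for the paper's
        # local minimal certainty, which the abstract does not define.
        self.threshold = threshold
        self.rules = {}  # (condition, label) -> Rule

    def classify(self, sample: dict):
        """Return the label of the most confident matching rule, or None."""
        items = frozenset(sample.items())
        best = None
        for rule in self.rules.values():
            if rule.condition <= items and rule.confidence >= self.threshold:
                if best is None or rule.confidence > best.confidence:
                    best = rule
        return best.label if best else None

    def feedback(self, sample: dict, true_label: str) -> None:
        """Relearn from a misjudged or unrecognized sample."""
        if self.classify(sample) == true_label:
            return  # correctly recognized, nothing to relearn
        items = frozenset(sample.items())
        # Adjust the confidence of every existing rule the sample matches.
        for rule in self.rules.values():
            if rule.condition <= items:
                rule.matches += 1
                if rule.label == true_label:
                    rule.hits += 1
        # Generate a candidate rule from the sample if none exists yet.
        key = (items, true_label)
        if key not in self.rules:
            self.rules[key] = Rule(items, true_label, hits=1, matches=1)
        # Keep only rules whose confidence reaches the threshold.
        self.rules = {k: r for k, r in self.rules.items()
                      if r.confidence >= self.threshold}
```

In this sketch, an unrecognized sample creates a new rule, while a misjudged sample lowers the confidence of the rule that misfired. Once that confidence drops below the threshold the rule is discarded, which keeps the rule set small.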