In spam short-message filtering systems, the message features used by traditional methods struggle to capture accurately how much each word contributes. This paper proposes a new method for finding feature words and constructing features that better reflects both the relationships among words and each word's contribution within a short message. Learning and classification simulation experiments were carried out using a sparse autoencoder combined with a support vector machine (SVM). The results show that the filtering performance of the proposed method is significantly better than that of similar classifiers reported to date.