Traditional static malicious Web page detection schemes come under growing pressure as the volume of newly created Web pages increases. To address this, a two-step analysis and detection process is introduced, with a dedicated feature extraction scheme proposed for each step. By applying an optimized naive Bayes algorithm and a support vector machine algorithm hierarchically, a malicious Web page detection system that balances efficiency and functionality, TSMWD (two-step malicious Web page detection system), was designed and implemented. The first layer filters out the large number of normal Web pages; it is efficient, fast, and easy to update and iterate, and it prioritizes the true positive rate. The second layer handles the smaller set of samples that remain after this filtering; it pursues detection performance, with a high requirement on accuracy and a correspondingly relaxed budget for time and resources. Experimental results show that the architecture increases detection speed while keeping overall detection accuracy essentially unchanged, so that more detection requests can be served in a given amount of time.
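The two-step control flow described in the abstract can be illustrated with a minimal sketch. This is not the TSMWD implementation: the function `two_step_detect`, the two thresholds, the feature names, and the toy scorers are all hypothetical stand-ins, occupying the places where the trained naive Bayes and support vector machine models would sit. The sketch only shows how a cheap first stage with a deliberately low threshold can release clearly benign pages, so that the costlier second stage runs on the remainder only.

```python
from typing import Callable, Dict, List

Features = Dict[str, float]
# A scorer returns an estimated probability that a page is malicious.
Scorer = Callable[[Features], float]


def two_step_detect(
    page: Features,
    fast_scorer: Scorer,
    accurate_scorer: Scorer,
    pass_threshold: float = 0.1,   # assumed value, chosen for illustration
    final_threshold: float = 0.5,  # assumed value, chosen for illustration
) -> str:
    """Classify one page with a cheap first stage and a costlier second stage."""
    # Stage 1: a low threshold favours the true positive rate, so only pages
    # that look clearly benign are released without further analysis.
    if fast_scorer(page) < pass_threshold:
        return "benign (stage 1)"
    # Stage 2: only the remaining suspicious pages pay for the slower model.
    if accurate_scorer(page) >= final_threshold:
        return "malicious (stage 2)"
    return "benign (stage 2)"


# Toy stand-ins for the two trained models (hypothetical features and weights).
def toy_fast_scorer(page: Features) -> float:
    return min(1.0, 0.5 * page.get("iframe_count", 0) + 0.5 * page.get("has_eval", 0))


def toy_accurate_scorer(page: Features) -> float:
    return min(1.0, 0.4 * page.get("has_eval", 0) + 0.6 * page.get("obfuscation_score", 0))


pages: List[Features] = [
    {"iframe_count": 0, "has_eval": 0, "obfuscation_score": 0.0},
    {"iframe_count": 1, "has_eval": 0, "obfuscation_score": 0.1},
    {"iframe_count": 0, "has_eval": 1, "obfuscation_score": 0.9},
]
results = [two_step_detect(p, toy_fast_scorer, toy_accurate_scorer) for p in pages]
for label in results:
    print(label)
```

In this toy run the first page never reaches the second stage, which is where the throughput gain reported in the abstract would come from. Lowering `pass_threshold` sends more pages to the second stage and reduces the risk of a malicious page being released early, at the cost of speed.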