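To overcome the slow learning speed of the naive Bayes (NB) algorithm when identifying unknown malicious code, this paper analyzes the naive Bayes and multi-naive Bayes (MNB) algorithms and, on that basis, proposes an improved Bayes algorithm, half-increment naive Bayes (HNB). The algorithm learns incrementally over the feature set and, without reducing classification accuracy, improves learning speed by about 30%. Tests on real samples show a classification accuracy of 96%, with 99% accuracy on known malicious code.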
The detection of unknown malicious executables is beyond the capability of many existing detection approaches. Machine learning and data mining methods can identify new or unknown malicious executables with some degree of success. Bayes and improved Bayes algorithms are able to detect unknown malicious executables; however, they take a long time to learn. This paper proposes a new improved algorithm. The new string-based classifier achieves high detection rates and can be expected to perform as well under real-world conditions.
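As a rough illustration of a string-based Bayes classifier that can keep learning as new samples arrive, the following minimal sketch implements a generic Bernoulli naive Bayes with count-based updates. It is not the HNB algorithm itself, whose details are not given here; the class name `IncrementalNB`, the smoothing parameter `alpha`, and the sample strings are all assumptions made for the example.

```python
import math
from collections import defaultdict


class IncrementalNB:
    """Generic Bernoulli naive Bayes over string features (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # Laplace smoothing (assumed value)
        self.class_counts = defaultdict(int)    # label -> number of samples seen
        self.feature_counts = defaultdict(lambda: defaultdict(int))  # label -> string -> count
        self.vocab = set()                      # every string seen so far

    def update(self, strings, label):
        """Learn one sample by updating counts; no retraining from scratch."""
        self.class_counts[label] += 1
        for s in set(strings):
            self.feature_counts[label][s] += 1
            self.vocab.add(s)

    def predict(self, strings):
        """Return the label with the highest posterior log-probability."""
        present = set(strings) & self.vocab
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            score = math.log(n / total)         # log prior
            counts = self.feature_counts[label]
            for s in self.vocab:
                p = (counts.get(s, 0) + self.alpha) / (n + 2 * self.alpha)
                score += math.log(p) if s in present else math.log(1.0 - p)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data: strings extracted from executables.
clf = IncrementalNB()
clf.update(["CreateRemoteThread", "WriteProcessMemory"], "malicious")
clf.update(["WriteProcessMemory", "VirtualAllocEx"], "malicious")
clf.update(["MessageBoxA", "GetOpenFileNameA"], "benign")
clf.update(["MessageBoxA", "printf"], "benign")
first = clf.predict(["WriteProcessMemory", "CreateRemoteThread"])

# A sample that arrives later is absorbed by one more count update.
clf.update(["MessageBoxA", "GetOpenFileNameA"], "benign")
second = clf.predict(["MessageBoxA", "printf"])
```

Because the model state is only a set of counts, each new sample costs one pass over its own strings, which is the property that makes incremental learning cheaper than retraining on the whole sample set.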