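Owing to the emergence of variant and polymorphic techniques, the number of malicious code samples has grown explosively. However, only a small fraction of the emerging malicious code is genuinely new; most of it consists of variants of known viruses. To screen out variants of known viruses from massive sample sets and thereby focus on new, unknown viruses, an improved method for determining the family to which a piece of malicious code belongs is proposed. Starting from the behavioral characteristics of malicious code, the method uses disassembly tools to extract static features from samples, selects representative functions of the malicious code with a one-class support vector machine, and draws on the idea of clustering algorithms to generate a virus family feature library. The family of a piece of malicious code is then determined by computing the similarity between the code and the feature library. A system was designed and implemented, and experimental results show that the improved method can effectively analyze and classify variants of various families.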
With the emergence of metamorphic and polymorphic techniques, the amount of malicious code is growing explosively, yet most new samples are variants of previously encountered malware. To address this problem and focus attention on genuinely new viruses, this paper presents an approach for determining the family of a piece of malicious code. It extracts static features from malware samples with disassembly tools and filters out characteristic functions with a one-class support vector machine. A family feature database is generated from those functions by adopting the idea of clustering. Characteristic functions extracted from an unknown sample are compared with the contents of the database to determine its family. Experimental results show that the approach can effectively analyze malicious code and classify variants into their malware families.
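The final matching step, in which an unknown sample's characteristic functions are compared with the family feature database, can be illustrated with a minimal sketch. The abstract does not state which similarity measure is used, so Jaccard similarity over sets of function identifiers is assumed here purely for illustration. The names `jaccard`, `assign_family`, `family_db`, and the `threshold` value are hypothetical and are not taken from the paper.

```python
from typing import Dict, Optional, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def assign_family(
    sample_functions: Set[str],
    family_db: Dict[str, Set[str]],
    threshold: float = 0.5,  # hypothetical cut-off, not from the paper
) -> Tuple[Optional[str], float]:
    """Return the best-matching family and its score.

    The family is None when no score reaches the threshold, which would
    flag the sample as a candidate for a new, unknown family.
    """
    best_family: Optional[str] = None
    best_score = 0.0
    for family, signature in family_db.items():
        score = jaccard(sample_functions, signature)
        if score > best_score:
            best_family, best_score = family, score
    if best_score < threshold:
        return None, best_score
    return best_family, best_score


# Toy feature database: each family maps to identifiers (for example,
# hashes) of its representative functions. The values are made up.
family_db = {
    "famA": {"f1", "f2", "f3"},
    "famB": {"f7", "f8"},
}

known_variant = assign_family({"f1", "f2", "f4"}, family_db)
unknown_sample = assign_family({"x9"}, family_db)
```

In this sketch a sample that shares enough representative functions with a stored family is labeled as a variant of that family, while a sample below the threshold is left unassigned so that analysis effort can be directed at it.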