针对恶意代码数量呈爆发式增长,但真正的新型恶意代码却不多,多数是已有代码变种的情况,通过研究恶意代码的行为特征,提出了一套判别恶意代码同源性的方法。从恶意代码的行为特征入手,通过敏感恶意危险行为以及产生危险行为的代码流程、函数调用,应用反汇编工具提取具体特征,计算不同恶意代码之间的相似性度量,进行同源性分析比对,利用DBSCAN聚类算法将具有相同或相似特征的恶意代码汇聚成不同的恶意代码家族。设计并实现了原型系统,实验结果表明提出的方法能够有效地对不同恶意代码及其变种进行同源性分析及判定。
With the problem of the explosive growth of malicious code and many of the malicious samples are variations of previously encountered samples, this paper presents a novel approach to investigate the homology of malicious code based on behavior characteristics. To distinguish the variations of malicious code, it studies the malicious behavior of malwares,then computes the similarity of characteristics and the call graphs which are extracted by disassembly tools. It employs the clustering algorithms of DBSCAN to discover the family of malicious code. Experiments show that it effectively investigates the homology of malicious code and cluster variations into different malicious code family.