Malware homology identification plays an important role in authorship attribution, in assigning responsibility for attack incidents, and in reconstructing attack scenarios. Current identification methods rely largely on manual analysis, which is inefficient and time-consuming. To improve efficiency, this paper proposes an automatic malware homology identification method based on calling habits. Working from seven classes of calling behavior, the method uses data mining algorithms to build a model of an author's programming habits, computes the degree of homology with the Frequent Pattern Outlier Factor (FPOF), an outlier detection algorithm based on frequent patterns, and selects the decision threshold with k-means clustering in order to decide whether samples are homologous. Experiments on real-world malware show that the method achieves an accuracy above 99% and an acceptable recall rate.
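The scoring and thresholding steps can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not list the seven classes of calling behavior, so each sample is represented here as a set of placeholder API-call tokens, frequent itemsets are mined by brute force rather than with a dedicated mining algorithm, and the rule of placing the threshold midway between two k-means centroids is an assumption. The function names `frequent_itemsets`, `fpof`, and `two_means_threshold` are hypothetical. FPOF is computed in its usual form, the sum of the supports of the frequent patterns contained in a sample divided by the total number of frequent patterns.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Brute-force mining: return {itemset: support} for itemsets up to max_size."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            support = sum(1 for t in transactions if candidate <= t) / n
            if support >= min_support:
                result[candidate] = support
    return result


def fpof(sample, patterns):
    """Frequent Pattern Outlier Factor of one sample against mined patterns.

    A higher value means the sample contains more of the frequent patterns,
    i.e. it looks less like an outlier relative to the modeled habits.
    """
    if not patterns:
        return 0.0
    contained = sum(sup for p, sup in patterns.items() if p <= sample)
    return contained / len(patterns)


def two_means_threshold(scores, iterations=100):
    """1-D k-means with k=2; returns the midpoint of the two centroids (assumed rule)."""
    lo, hi = min(scores), max(scores)
    for _ in range(iterations):
        low = [s for s in scores if abs(s - lo) <= abs(s - hi)]
        high = [s for s in scores if abs(s - lo) > abs(s - hi)]
        new_lo = sum(low) / len(low)
        new_hi = sum(high) / len(high) if high else hi
        if (new_lo, new_hi) == (lo, hi):
            break
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2


# Made-up samples attributed to one author (placeholder call tokens).
known = [
    {"CreateFileA", "WriteFile", "RegSetValueExA"},
    {"CreateFileA", "WriteFile", "RegSetValueExA", "WinExec"},
    {"CreateFileA", "WriteFile", "WinExec"},
    {"CreateFileA", "RegSetValueExA", "WriteFile"},
]
patterns = frequent_itemsets(known, min_support=0.5)

# Score the known samples plus two made-up unknown samples, then pick a threshold.
unknown = [{"CreateFileA", "WriteFile", "WinExec"}, {"socket", "connect"}]
scores = [fpof(s, patterns) for s in known + unknown]
threshold = two_means_threshold(scores)
for sample in unknown:
    print(sorted(sample), fpof(sample, patterns) >= threshold)
```

In this sketch a sample is judged homologous when its FPOF is at or above the threshold. A practical system would mine patterns with an algorithm such as Apriori or FP-growth, since brute-force enumeration does not scale beyond a small vocabulary of calls.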