提出了一个Android恶意代码的静态检测系统Maldetect,首先采用逆向工程将DEX文件转化为Dalvik指令并对其进行简化抽象,再将抽象后的指令序列进行N-Gram编码作为样本训练,最后利用机器学习算法创建分类检测模型,并通过对分类算法与N-Gmm序列的组合分析,提出了基于3-Gram和随机森林的优选检测方法.通过4000个Android恶意应用样本与专业反毒软件进行的检测对比实验,表明Maldetect可更有效地进行Android恶意代码检测与分类,且获得较高的检测率.
A novel static Android malware detection system Maldetect is proposed in this paper. At first, the Dalvik instructions decompiled from Android DEX files are simplified and abstracted into simpler symbolic sequences. N-Gram is then employed to extract the features from the simplified Dalvik instruction sequences, and the detection and classification model is finally built using machine learning algorithms. By comparing different classification algorithms and N-Gram sequences, 3-Gram sequences with the random forest algorithm is identified as an optimal solution for the malware detection and classification. The performance of our method is compared against the professional antivirus tools using 4 000 malware samples, and the results show that Maldetect is more effective for Android malware detection with high detection accuracy.