[Objective] This study makes full use of multi-source online evaluation data and URL abnormal feature data to investigate feasible ways of improving the accuracy of phishing website identification. [Methods] Using eight machine learning techniques, we compared the performance of online evaluation data with that of traditional URL abnormal features in identifying phishing websites. We then fused the two types of data to explore whether identification accuracy could be improved further. [Results] In phishing website identification, online evaluation data yielded better results than traditional URL abnormal features. Fusing the two types of data helped improve identification accuracy to some extent. [Limitations] We did not account for the severe imbalance between the numbers of phishing websites and legitimate websites. [Conclusions] Identifying phishing websites by making full use of multi-source online evaluation data and URL abnormal features is a reasonable and effective approach, and it offers a useful reference for subsequent research.
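To make the fusion step in the Methods concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not list the actual URL abnormal features or evaluation sources, so the five URL features below (URL length, IP address as host, presence of "@", dot count, hyphen in host) and the helper names `url_abnormal_features` and `fuse_features` are assumptions chosen for illustration. Fusion is shown here as simple concatenation of the two feature vectors, which is one possible reading of "fused".

```python
import ipaddress
from typing import List, Sequence
from urllib.parse import urlparse


def url_abnormal_features(url: str) -> List[float]:
    """Return a hypothetical URL abnormal feature vector.

    The feature set is an assumption for illustration; the abstract
    does not specify which URL features were used.
    """
    host = urlparse(url).hostname or ""
    try:
        # A raw IP address in place of a domain name is a common warning sign.
        ipaddress.ip_address(host)
        host_is_ip = 1.0
    except ValueError:
        host_is_ip = 0.0
    return [
        float(len(url)),               # overall URL length
        host_is_ip,                    # 1.0 if the host is an IP address
        1.0 if "@" in url else 0.0,    # "@" anywhere in the URL
        float(host.count(".")),        # number of dots in the host
        1.0 if "-" in host else 0.0,   # hyphen in the host
    ]


def fuse_features(url_feats: Sequence[float],
                  eval_feats: Sequence[float]) -> List[float]:
    """Fuse the two data types by concatenation (an assumed fusion scheme)."""
    return list(url_feats) + list(eval_feats)
```

The fused vector could then be passed to any classifier. The evaluation features would come from whatever online evaluation sources are available, represented here only as a generic numeric sequence.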