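Static code analysis tools are now widely used in software development and security evaluation. They analyze a program's source code or binary code without executing it. Although static analysis tools can find errors that other testing methods struggle to detect, they share a serious problem: a high false-positive rate. Many of the warnings a static analysis tool generates are false and do not correspond to real security vulnerabilities or defects. Users must therefore spend considerable time and resources filtering the false warnings out of the many reported, which greatly reduces the usability of these tools. This paper proposes an optimization method for static analysis tools that combines static analysis results with the software's version history to compute a priority for each warning; the higher a warning's priority, the more likely it is to correspond to a real security vulnerability or defect. We validated the method on three open-source projects (Lucene, Cassandra, and Hadoop). The experimental results show that the method improves the precision of the FindBugs static analysis tool by 23%, 36%, and 25%, respectively.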
Static code analysis tools are widely used today to analyze code without executing it, but they share a critical challenge: the low precision of reported warnings. Users must spend a lot of time sifting through the warnings to separate real defects from false positives. In this paper, we propose a ranking approach for warnings issued by static analysis tools, based on the history of software revisions. We evaluated our approach on three open-source projects, Lucene, Cassandra, and Hadoop, in which warning precision improved by 23%, 36%, and 25%, respectively.