Spectrum-based fault localization techniques can effectively assist developers in locating software faults. However, most existing automated techniques perform well on programs with a single fault and noticeably worse when multiple faults are present. The approaches that do handle multiple faults reasonably well still need improvement, because they are either computationally complex or require considerable human interaction. To address these problems, this paper proposes GAMFal, a spectrum-based multiple-fault localization technique built on a genetic algorithm. First, following the ideas of search-based software engineering, multiple-fault localization is modeled as a search problem: a candidate fault distribution is encoded as a chromosome in the form of a binary string, and an extended Ochiai coefficient is used to compute the fitness (suspiciousness) value of each individual. Next, a genetic algorithm searches the solution space for candidate fault distributions with the highest fitness and, once the termination condition is met, returns the best population. Finally, program entities are ranked according to this population, so that developers can examine them in order and eventually determine the locations of the multiple faults. The empirical study uses the seven programs of the Siemens suite and three Linux programs (gzip, grep, and sed) as subjects, and extends the traditional evaluation metric EXAM to EXAMF and EXAML. GAMFal is compared with classic fault localization techniques (Tarantula, Improved Tarantula, and Ochiai), and the Friedman test and the least significant difference (LSD) test are used to assess the statistical significance of the observed differences. The results indicate that GAMFal outperforms the traditional techniques in overall localization efficiency and requires less human interaction, and that its execution time is within an acceptable range.
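The pipeline summarized above (binary encoding, Ochiai-style fitness, genetic search, ranking) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration. The abstract does not give the exact form of the extended Ochiai coefficient, the genetic operators, or the rule that turns the final population into a ranking, so the sketch uses one plausible set-based generalization of Ochiai, truncation selection with one-point crossover, and a fitness-weighted vote. The coverage matrix and the names `fitness`, `evolve`, and `rank_statements` are hypothetical.

```python
import math
import random

# Hypothetical coverage matrix: rows are tests, columns are statements.
# COVERAGE[t][s] == 1 means test t executed statement s.
COVERAGE = [
    [1, 1, 0, 0],  # t0 fails (executes statement 1)
    [1, 0, 1, 0],  # t1 passes
    [1, 0, 0, 1],  # t2 fails (executes statement 3)
    [1, 0, 1, 0],  # t3 passes
]
FAILED = [True, False, True, False]


def fitness(candidate, coverage=COVERAGE, failed=FAILED):
    """Ochiai-style score of a candidate fault distribution (assumed form).

    A test counts as covering the candidate if it executes at least one
    statement whose bit is set in the candidate.
    """
    ef = ep = 0  # failed / passed tests that cover the candidate
    for row, is_failed in zip(coverage, failed):
        if any(c and bit for c, bit in zip(row, candidate)):
            if is_failed:
                ef += 1
            else:
                ep += 1
    denom = math.sqrt(sum(failed) * (ef + ep))
    return ef / denom if denom else 0.0


def evolve(n_statements, pop_size=20, generations=30, mutation_rate=0.1, seed=0):
    """Toy genetic search; returns the final population, best candidate first."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_statements)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # truncation selection (assumed)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_statements)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [bit ^ (rng.random() < mutation_rate) for bit in child]
            children.append(child)
        population = parents + children
    population.sort(key=fitness, reverse=True)
    return population


def rank_statements(population):
    """Order statements by a fitness-weighted vote (assumed ranking rule)."""
    n = len(population[0])
    score = [0.0] * n
    for candidate in population:
        f = fitness(candidate)
        for s, bit in enumerate(candidate):
            score[s] += f * bit
    return sorted(range(n), key=lambda s: score[s], reverse=True)


if __name__ == "__main__":
    final_population = evolve(n_statements=4)
    print("best candidate:", final_population[0])
    print("checking order:", rank_statements(final_population))
```

In this toy matrix, the candidate `[0, 1, 0, 1]` (statements 1 and 3 flagged together) scores 1.0 because it covers both failed tests and no passed test, whereas each of those statements alone scores about 0.707. That gap is the intuition behind scoring sets of statements rather than individual statements when several faults are present.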