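Spectrum-based fault localization (SBFL) techniques can identify the executable code that causes a program to fail. The number of test cases and the number of times statements are covered can be used to construct the SBFL dichotomy matrix. Using this matrix, many SBFL association measure formulas have been proposed. However, these association measures often suit only some program sets. Therefore, a technique based on classification algorithms is proposed that can learn an association measure specific to a program set. Training samples are built from pairs of faulty and correct statements, and their features are formed by subtracting the conditional probabilities of the two statements in each pair. To verify the effectiveness of the technique, experiments were conducted on three benchmark datasets: the Siemens suite, space, and gzip. The association measures trained with Weka's Logistic, SGD, SMO, and LibLinear all performed markedly better than fixed-form SBFL measures.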
Spectrum-based fault localization (SBFL) techniques aim to identify the executable program code that correlates with failure. A dichotomy matrix for SBFL records the bivariate frequency distribution of test case results and program element hit counts. Given the matrix, many SBFL association measures have been proposed to compute suspiciousness scores for program elements. Research shows that no association measure is statistically better than the others when localizing faults in buggy programs. Therefore, a technique based on classification algorithms is proposed that automatically learns a measure specific to a program set. Each sample in the training dataset is constructed from a pair consisting of a faulty statement and a non-faulty statement, and its features are the differences between the probability features of the two statements. The technique is evaluated on three benchmark datasets: the Siemens suite, space, and gzip. Experimental results indicate that the measures learned with Weka's LibLinear, Logistic, SGD, and SMO outperform existing SBFL association measures.
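The following is a minimal Python sketch of the ideas summarized above, not the authors' implementation. It builds the per-statement dichotomy counts from a coverage matrix, computes Ochiai as one example of a fixed-form association measure, and forms a pairwise training sample as the difference of probability features. The abstract does not list the exact features, so the four conditional probabilities in `prob_features` are an assumption chosen for illustration, and all function names and the toy data are hypothetical.

```python
import math

def spectrum_counts(coverage, failed):
    """Return (ef, ep, nf, np_) per statement.

    coverage[t][s] is True if test t executed statement s;
    failed[t] is True if test t failed.
    """
    counts = []
    for s in range(len(coverage[0])):
        ef = sum(1 for t, row in enumerate(coverage) if row[s] and failed[t])
        ep = sum(1 for t, row in enumerate(coverage) if row[s] and not failed[t])
        nf = sum(1 for t, row in enumerate(coverage) if not row[s] and failed[t])
        np_ = sum(1 for t, row in enumerate(coverage) if not row[s] and not failed[t])
        counts.append((ef, ep, nf, np_))
    return counts

def safe_div(a, b):
    """Divide, returning 0.0 when the denominator is zero."""
    return a / b if b else 0.0

def ochiai(ef, ep, nf, np_):
    """Ochiai suspiciousness, a well-known fixed-form SBFL measure."""
    return safe_div(ef, math.sqrt((ef + nf) * (ef + ep)))

def prob_features(ef, ep, nf, np_):
    """Assumed feature set: conditional probabilities from the 2x2 matrix."""
    return [
        safe_div(ef, ef + ep),   # P(fail | executed)
        safe_div(ef, ef + nf),   # P(executed | fail)
        safe_div(ep, ep + np_),  # P(executed | pass)
        safe_div(nf, nf + np_),  # P(fail | not executed)
    ]

def pair_sample(faulty, clean):
    """Feature vector of a (faulty, non-faulty) pair: elementwise difference."""
    return [a - b for a, b in zip(prob_features(*faulty), prob_features(*clean))]

# Toy spectrum: 4 tests, 3 statements; statement 1 is taken to be the faulty one.
coverage = [
    [True, True, False],   # test 0 (fails)
    [True, False, True],   # test 1 (passes)
    [True, True, False],   # test 2 (fails)
    [True, False, True],   # test 3 (passes)
]
failed = [True, False, True, False]

counts = spectrum_counts(coverage, failed)
scores = [ochiai(*c) for c in counts]
sample = pair_sample(counts[1], counts[2])  # positive training example
```

Under these assumptions, a linear classifier trained on such difference vectors (with the reversed pairs as negative examples) yields a weight vector, and applying those weights to `prob_features` of a single statement gives a learned suspiciousness score that can be used for ranking in place of a fixed formula such as Ochiai.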