Among methods for detecting software homology (source code plagiarism), comparison based on abstract syntax trees (ASTs) is widely used in plagiarism detection tools because it effectively detects verbatim copying, renamed variables, and reordered code. However, AST-based comparison fails against plagiarism that changes variable types or adds meaningless variables. To address this, we propose an improved AST-based method that prunes the leaf nodes of the syntax tree that interfere with the comparison, thereby recovering the structure of the copied original. The improved method can effectively detect plagiarism that modifies variable types or adds meaningless variables.
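The pruning idea can be illustrated with a minimal sketch using Python's standard `ast` module. This is not the paper's implementation: the abstract does not say which leaf nodes are pruned, so the choices here are assumptions (drop type annotations, drop assignments to names that are never read, and erase identifier names). Python type annotations stand in for variable types, and the names `Pruner`, `loaded_names`, and `fingerprint` are hypothetical.

```python
import ast


def loaded_names(tree):
    """Collect every identifier that is read (Load context) somewhere in the tree."""
    return {
        node.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load)
    }


class Pruner(ast.NodeTransformer):
    """Hypothetical pruning pass; which nodes to cut is an assumption of this sketch."""

    def __init__(self, used_names):
        self.used_names = used_names

    def visit_AnnAssign(self, node):
        # Cut the type annotation: `x: int = 1` is treated like `x = 1`.
        if node.value is None:
            return None  # bare declaration such as `x: int`
        plain = ast.Assign(targets=[node.target], value=node.value, type_comment=None)
        return self.visit(ast.copy_location(plain, node))

    def visit_Assign(self, node):
        # Cut assignments whose targets are plain names that are never read.
        # Note: this also discards any side effects of the right-hand side.
        if all(
            isinstance(t, ast.Name) and t.id not in self.used_names
            for t in node.targets
        ):
            return None
        return self.generic_visit(node)

    def visit_Name(self, node):
        # Erase the identifier so that renaming does not change the tree.
        return ast.copy_location(ast.Name(id="_", ctx=node.ctx), node)


def fingerprint(source):
    """Return a structural string for the pruned AST of `source`."""
    tree = ast.parse(source)
    pruned = Pruner(loaded_names(tree)).visit(tree)
    return ast.dump(pruned, annotate_fields=False)


original = """
total = 0
for i in range(10):
    total = total + i
print(total)
"""

suspect = """
unused = 42
acc: int = 0
for k in range(10):
    acc = acc + k
print(acc)
"""

print(fingerprint(original) == fingerprint(suspect))
```

In this sketch the two fingerprints compare equal even though the suspect adds an unused variable, adds a type annotation, and renames identifiers. A real detector would more likely compute a similarity score over subtrees than test for exact equality.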